Experiment II, subject by consonant by
vowel by intensity interaction - labial consonant


Experiment II, subject by consonant by
vowel by intensity interaction - dental consonant
What is the difference between a stop and a fricative?
In terms of articulation, the difference is that stops
involve a complete closure in the vocal tract, while
fricatives are produced by narrowing the vocal tract at some
point. But what are the acoustic differences relevant to
stop and fricative identification? The answer is not as
clear. Perceptual studies of the distinction between stops 
and fricatives in English are not available. Many papers on 
the acoustic properties of fricatives exist (Carney & Moll, 
1971; Delattre, Liberman, & Cooper, 1962; Fujisaki & Osamu,
1978; Harris, 1958; Heinz & Stevens, 1961; Hughes & Halle, 
1956; Jassem, 1965; Lacerda, 1982; Lariviere, Winitz, & 
Herriman, 1975; McCasland, 1979; Strevens, 1960). There are 
also many studies of the acoustic properties of stops (for 
example, Blumstein & Stevens, 1980; Dorman, 
Studdert-Kennedy, & Raphael, 1977; Fischer-Jørgensen, 1954;
Halle, Hughes, & Radley, 1957). As well, the differences 
between stops and glides (Miller & Liberman, 1979; Suzuki, 
1970), and the differences between fricatives and affricates 
have been investigated (Dorman, Raphael, & Isenberg, 1980; 
Gerstman, 1957; Howell & Rosen, 1983; Repp, Liberman, 
Eccardt, & Pesetsky, 1978; and Van Heuven, 1979). But few 
studies have directly compared perceptual attributes of
fricatives and stops (Baker, 1975; Isenberg, 1978; Malecot,
1968; Treon, 1970). This thesis investigates the perceptual
effects of manipulating acoustic properties of some English
fricatives and stops.

The particular English consonants under consideration
are the word initial /b/, /d/, /v/, and /ð/. The acoustic
properties of the noise portion of the consonants (stop
burst or fricative frication) which were manipulated are
duration, slope of onset (abrupt vs. gradual), and
intensity. The frequency spectra of the consonants and the
formant transitions were not altered. In the literature 
review, the acoustic properties of stops and fricatives will 
be discussed. At this point, the four phonemes and the 
reasons for studying these four are considered. 
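
The three manipulations named above (duration, slope of onset,
and intensity of the noise portion) can be pictured with a short
Python sketch. It is only an illustration under stated
assumptions: the sample rate, the linear shape of the onset ramp,
the function names, and the white-noise input are all
hypothetical and are not taken from the experiments reported
here.

```python
import random

SAMPLE_RATE = 10_000  # Hz; an assumed value, not taken from the thesis


def truncate(noise, duration_ms, sample_rate=SAMPLE_RATE):
    """Keep only the first duration_ms of the noise portion."""
    n = int(round(duration_ms * sample_rate / 1000))
    return noise[:n]


def apply_onset_ramp(noise, rise_ms, sample_rate=SAMPLE_RATE):
    """Scale the first rise_ms linearly from 0 up to full amplitude.

    A rise_ms of 0 leaves the onset abrupt.
    """
    n = min(int(round(rise_ms * sample_rate / 1000)), len(noise))
    ramped = list(noise)
    for i in range(n):
        ramped[i] *= i / n
    return ramped


def scale_db(noise, gain_db):
    """Change intensity by gain_db decibels (negative values attenuate)."""
    factor = 10 ** (gain_db / 20)
    return [x * factor for x in noise]


# Stand-in for a recorded burst or frication: 100 ms of white noise.
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

# A 40 ms noise portion with a gradual 20 ms onset, attenuated by 6 dB.
stimulus = scale_db(apply_onset_ramp(truncate(noise, 40), 20), -6)
```

Because each function touches only the amplitude envelope or the
length of the noise, the spectrum of the underlying signal and
any following formant transitions are left alone, which matches
the restriction stated above.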

The English stops /b/ and /d/ and fricatives /ð/ and
/v/ were chosen because they are the only English
stop/fricative pairs that have approximately the same places
of articulation: /b/ and /v/ are labial; /d/ and /ð/ are
dental.

Voiced consonants were chosen intentionally. A 
comparison of the voiceless fricatives and stops would be 
complicated by the long voice onset time (VOT) after the 
release of voiceless stops which does not appear after 
voiceless fricatives. Gerstman (1957) conducted an
experiment in which the overall duration and onset time of
the voiceless affricate /tʃ/ were reduced. At short durations
and abrupt onsets, considerably more voiced stops were heard
than unvoiced stops, although the original stimulus
consonant was unvoiced. Gerstman says that "the result is not
surprising when we consider that voiceless stops usually
show a considerable separation between friction and vowel
while voiced stops do not" (p. 51). Since there was no
delay between the consonant and vowel in Gerstman's stimuli,
his consonants were heard as voiced when perceived as stops.
Voiced consonants were selected as stimuli for the present
experiments to avoid this complication associated with
voicing of stops.

There is some question of the comparability of /b/ to
/v/ and /d/ to /ð/. In the case of /b/ and /v/, the question is
primarily one of difference in place of articulation: /b/ is 
produced bilabially and /v/ is labiodental. But, since both 
are labial, their consonant-vowel formant transitions should 
be similar. The non-existence of an English bilabial 
fricative phoneme and a labiodental stop phoneme indicates 
that /b/ and /v/ are the phones to be compared, even though 
their places of articulation are not identical. 

Some investigators regard the alveolar fricatives /s/ 
and /z/ as the counterparts of the stops /t/ and /d/. One
reason for comparing /d/ and /ð/ instead of /d/ and /z/ is
historical. According to Grimm's law, the Indo-European
bh, dh, and gh became, respectively, the Germanic sounds β,
ð, and ɣ, and later, in initial position at least, b, d, g
(Pyles and Algeo, 1982). Also, Indo-European /p/, /t/, /k/
became Germanic /f/, /θ/, and /x/. Further, Verner's law
claims that these Proto-Germanic voiceless fricatives became
voiced in a voiced environment before a stressed syllable.
The changes dh → /ð/ → /d/ and /t/ → /θ/ → /ð/ confirm the
assumption of
a relationship between the dental fricatives and stops. 

Another justification for pairing /d/ and /ð/ is their
similarity in production. The phonemes /θ/ and /ð/ are
often called "interdental", but, though /θ/ may be
interdental, /ð/ certainly is not. Umeda (1977) noted that
many /ð/'s are articulated with a tap of the tongue against
the alveolar ridge, "an articulation very similar to that of
/d/". Similarly, in a preliminary investigation, the author
noted small bursts in the oscillograms of some tokens of
/ð/. Umeda described the articulation of /ð/ as glide-like
because it does not "hold a constant articulatory manner" as
do /s/ and /z/ (p. 849). The author would call the
articulation of /ð/ stop-like because of the brief occlusion
of the vocal tract sometimes present in its production. Either
way, it is apparent that the articulation of /ð/ resembles
that of /d/ more closely than the articulation of /z/ does,
both in place and manner of production. 

Substitutions made by children provide another argument 
for the comparability of /d/ and /ð/. In child acquisition
of English fricatives, Moskowitz (1975) noted that /d/ is
typically substituted for /ð/, at first. In her study of
eight children, aged 1:1 to 3:5, Moskowitz found that seven
out of the eight produced /d/ for /ð/ at least sometimes.
(Only the 3:5 year old had complete control over his /ð/.)
Moskowitz also notes that an older child may substitute /v/
for /ð/ in imitation when he has to match "the acoustic
characteristics of the model he is to imitate, and /v/
serves the purpose more satisfactorily than /d/ does" (p.
147). However, "for the child who does not yet have total
mastery over /ð/, and who also does not have /θ/ or /z/, the
closest substitute in articulation is /d/. In his
spontaneous speech, he seems to be attempting to learn how
to change or modify /d/ appropriately to arrive at /ð/" (p.
147). This evidence supports the statement that /ð/ is
similar to /d/.

A final piece of evidence supporting /d/-/ð/ similarity
is their occurrence in morphophonemic alternations in some
languages. Danish exhibits opposition of strong and weak
consonants (Jakobson, Fant, & Halle, 1969). Strong
consonants occur at the beginning of monosyllabic words, weak
consonants occur at the end. For example, the phoneme /d/
occurs as [d] word initially as in dag, "day"; however, word
finally /d/ is pronounced [ð] as in had [hað], "hate".
Another language with such an alternation is Portuguese:
intervocalically, /d/, /b/, /g/ become [ð], [β], [ɣ].
Spanish also exhibits this allophonic variation. The
historical sound changes of /ð/ to /d/, the articulatory
similarity of these phones, the substitution of /d/ for /ð/
in language acquisition, and the occurrence of
morphophonemic alternations between /d/ and /ð/ are all
evidence that /d/ and /ð/ are similar and bear comparison.

Finally, the status of /ð/ in English must be discussed
because of its peculiarity. The phoneme /ð/ is relatively
rare in languages of the world. According to Ruhlen's 1976
inventory, it occurs in only about 5 percent of the languages
surveyed. In English, /ð/ is unusual in
that it does not occur in initial position in full "content"
words (Umeda, 1977). It could be considered a word initial
function-word allophone of /d/. But on the basis of the
minimal pair test, it is an English phoneme. For example,
the difference between the words "dough" and "though" is the
word initial stop or fricative. In context, /d/ and /ð/ may
be distinguished, word initially, solely by linguistic
factors; nevertheless, it is interesting to study the
perceptual properties of /ð/ because, although its word initial
occurrence is restricted to function words, these function 
words occur very frequently. 

In a dictionary frequency count, done by Trnka (1935),
/ð/ was rated least frequent out of 23 English consonants.
Since /ð/ occurs in few words, especially few content words,
it rates low in a count in which every word is considered
only once. In this count /d/ was eighth, /b/ was tenth, and
/v/ was fourteenth. French (1930) counted frequency of
occurrence of phones in conversation, which meant that
phones in frequently occurring words, such as function
words, were counted as occurring more often. French's data
was collected from extemporaneous telephone conversations. 
Tobias (1959) retranscribed French's data to a more standard 
transcription and analysed the relative occurrence of
phonemes in the data. He found that /ð/ ranked sixth most
frequent consonant, /d/ ranked third, /v/ thirteenth, and
/b/ sixteenth in the retranscribed data. It is clear from a
comparison of Trnka's and French's studies that although /ð/
occurs in relatively few words, these words occur
frequently. The frequent occurrence of /ð/ can be
attributed to the fact that many deictic words contain a
/ð/, such as this, that, these, those, there, then, than.

Other studies of phoneme usage also rank /ð/ high in
frequency, usually just slightly lower than /d/. Denes
(1963) found /ð/ to be the seventh most frequently spoken
consonant. According to Denes, /d/ is fourth most frequent,
/b/ is twelfth, and /v/ is thirteenth. Word initially,
however, /ð/ is the most frequent consonant, /b/ is the
seventh most frequent consonant, /d/ is ninth, and /v/ is
nineteenth. Wang and Crawford (1960) compiled the results
of ten frequency counts of English phonemes, including the
Trnka and French surveys mentioned above. In nine of the
ten, the tenth being Trnka's dictionary count, /ð/ was
between twelfth and sixth most frequent: in four studies it
was sixth, in three it was seventh. In these nine studies
/d/ was ranked between tenth and third, usually third,
fourth, or fifth; /b/ always ranked between 12 and 17; and
/v/ ranked between 10 and 17. The phonemes /b/ and /v/
occur approximately equally often in English. Likewise, /d/
and /ð/ occur with similar frequency in English
conversation. These equalities support the comparison of
the labial and dental stops and fricatives.

Much is known about English stops and fricatives. But
little is known about the perceptual relationship between
the two types of consonants. The many aspects in which they
are related encourage an investigation of the perception of
these consonants. The intent of this thesis is to
contribute to an understanding of the bases of recognition
of the English voiced labial and dental stops and
fricatives. Although the question posed at the beginning of
this chapter asked for the differences between stops and
fricatives, the intent here is to illuminate not only the
differences but also the similarities between /b/ and /v/
and between /d/ and /ð/.


II. Literature Review


A. Spectral Properties 

Many spectrographic studies have been done on the 
properties of fricatives and stops. (Such studies include 
Baker, 1975; Blumstein & Stevens, 1980; Delattre, Liberman, 
& Cooper, 1955; Delattre et al., 1962; Halle et al., 1957;
Heinz & Stevens, 1961; Hughes & Halle, 1956; Jassem, 1965;
McCasland, 1979; and Strevens, 1960.) The spectral
qualities of stops and fricatives have not often been
compared, although some conjectures have been made. Strevens
(1960) supposes that the spectrum of the burst of a
voiceless stop would be like the spectrum of a closely
homorganic fricative. Harris (1958) also suggests that the
frication of a fricative is comparable spectrally to the
burst of a stop. In comparing the spectrograms of
affricates and fricatives, Howell and Rosen (1983) found no
spectral differences. These studies imply comparability 
between fricatives and stops, at least with respect to the 
noise spectra. 

Another spectral consideration in the comparison of
consonants, other than that of the noise portion, is the
consonant to vowel transition. Most studies of formant
transitions and formant transition loci have considered stop
consonants, and ignored fricatives. Carney and Moll (1971),
however, suggest that Öhman's (1966) observations apply to
fricatives as well as stops: VCV articulations can be
represented as vowel to vowel diphthongal gestures with an
independent consonant superimposed on those gestures.

Delattre et al. (1955) discuss second formant
transition loci and postulate that since they reflect the
place of articulation of a consonant, all homorganic
consonants should have the same second formant locus. For
example, the second formant transitions from /b/, /p/, and
/m/ into a following vowel would have the same frequency
pattern, since formant transitions reflect the movement of
articulators from the consonant to the following vowel, and
/b/, /p/, and /m/ all have the same place of articulation.
In a later paper, Delattre et al. (1962) present the results
of an investigation of second formant transition loci for
English fricatives. As in Delattre's earlier stop
experiments, they synthesized stimuli with flat (horizontal)
second and third formants at various frequencies. The locus
found for the second formant of /ð/ was 1400 Hz; the locus
for /v/ was 700 Hz. The placement of the F2 locus for /ð/
is between those found earlier for /b/ (700 Hz) and /d/
(1800 Hz). This is the expected result since the second
formant transition varies according to place of
articulation and the place of articulation of /ð/ is close
to that of /d/, but slightly more anterior. The /v/ and /b/
loci were found to be the same, indicating that the place of 
production of these two consonants is the same. 
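
The locus values cited in this paragraph can be laid side by
side in a few lines of Python. This is only a bookkeeping
sketch: the dictionary and the helper name `locus_distance` are
hypothetical, and the figures are the rounded values given
above.

```python
# Second formant (F2) loci in Hz, as cited in the paragraph above
# (Delattre et al., 1955, for the stops; 1962, for the fricatives).
F2_LOCUS_HZ = {"b": 700, "v": 700, "ð": 1400, "d": 1800}


def locus_distance(c1, c2):
    """Absolute difference in Hz between the F2 loci of two consonants."""
    return abs(F2_LOCUS_HZ[c1] - F2_LOCUS_HZ[c2])


labial_gap = locus_distance("b", "v")   # 0 Hz: identical loci
dental_gap = locus_distance("d", "ð")   # 400 Hz: close, but not identical
cross_gap = locus_distance("v", "ð")    # 700 Hz: labial vs. dental fricative
```

On these figures the labial pair shares a locus, while the
dental pair differs by 400 Hz, which is still less than the 700
Hz separating the two fricatives from each other.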

This brief review of spectral properties of stops and
fricatives indicates similarities between homorganic stops
and fricatives with respect to (1) the frequency spectra of
the noise portions and (2) the consonant-vowel formant
transitions. That is, the noise of the burst of the stop
/b/ and the noise of the frication of the fricative /v/ are
comparable in noise frequency. Likewise, the stop /d/ and
the fricative /ð/ should be similar. As well, the
consonant-vowel formant transitions should be the same
within the labial pair and within the dental pair.


B. Onset 

While studies comparing perceptually relevant qualities 
of stops and fricatives are rare, many studies have been 
done comparing fricatives and affricates. (For instance, 
Dorman et al., 1980; Gerstman, 1957; Howell & Rosen, 1983;
Repp et al., 1978; and Van Heuven, 1979.) In studies of
English consonants, the particular fricative and affricate
usually considered are /ʃ/ and /tʃ/. The cues for the
fricative/affricate distinction are duration and rise time
(Gerstman, 1957; Cutting & Rosner, 1974; and Howell & Rosen,
1983).

In initial position, Gerstman (1957) showed that rapid
rise times and brief durations of fricative noise lead to
the perception of affricates, while slower rise times and
longer frication lead to the perception of fricatives.
Gerstman claimed that fricatives are distinguished from
affricates principally on the basis of rise time. Cutting and
Rosner (1974) studied rise times in both speech and
nonspeech stimuli and found that musical stimuli (sawtooth
and sine waves) with onset rise times of less than 40 ms
were identified as "plucked" 92 percent of the time.
Stimuli with rise times of 50 to 80 ms were called "bowed"
87 percent of the time. This indicates, generally, that
listeners can distinguish signal quality based simply on
duration of rise time. Using speech stimuli, Cutting and
Rosner found results similar to those of the musical
stimuli. Obstruent consonant noise signals with rise times
from zero to 30 ms long were identified as /tʃ/ 88 percent
of the time; those with rise times 60, 70, or 80 ms long
were called /ʃ/ 80 percent of the time; and stimuli with 40
or 50 ms rise times were ambiguous. From these results,
Cutting and Rosner concluded that the auditory mechanism has
a natural sensitivity at approximately 40 ms.

Howell and Rosen (1983) dispute Cutting and Rosner's 
results. They found that the mean rise times for affricates 
and fricatives were 33 and 76 ms in running speech and 49
and 123 ms in isolation for /tʃ/ and /ʃ/, respectively; in
nonsense syllables they were 61 and 120 ms long. Howell and
Rosen also attempted to measure the rise time of the voiced
affricate /dʒ/ and the voiced fricative /ʒ/. At the same
time, they remeasured /tʃ/ and /ʃ/ using a different
procedure. They found mean rise times for /dʒ/ of 49 ms and
for /ʒ/ of [...] ms, for /tʃ/ [...] ms and for /ʃ/ [...] ms. The
comparison indicates only a small difference between the
rise times of voiced and voiceless consonants.

Howell and Rosen use their results to dispute Cutting
and Rosner's claim of a "natural auditory sensitivity" at
40 ms. The rise time boundary cannot be fixed at 40 ms
because rise time varies according to context (running
speech versus isolated words). And, even if there were no
context variation in rise time, the boundary between /tʃ/
and /ʃ/ could not be at 40 ms because this would result in
almost all of Howell and Rosen's measured stimuli being
classified as fricatives.
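
The force of this argument can be checked against the mean rise
times quoted earlier in this section. The Python sketch below
applies a fixed 40 ms boundary to those means. The rule itself
(`label`) is a hypothetical simplification introduced only for
illustration, and the check uses the published means rather than
Howell and Rosen's individual tokens, which are not reproduced
here.

```python
# Mean rise times (ms) reported by Howell and Rosen (1983), as cited above.
MEAN_RISE_MS = {
    "running speech": {"affricate": 33, "fricative": 76},
    "isolation": {"affricate": 49, "fricative": 123},
    "nonsense syllables": {"affricate": 61, "fricative": 120},
}

BOUNDARY_MS = 40  # the boundary proposed by Cutting and Rosner (1974)


def label(rise_ms, boundary_ms=BOUNDARY_MS):
    """Hypothetical fixed-boundary rule: a short rise time means affricate."""
    return "affricate" if rise_ms < boundary_ms else "fricative"


# Collect every (context, intended category) whose mean rise time
# falls on the wrong side of the fixed boundary.
misclassified = [
    (context, intended)
    for context, means in MEAN_RISE_MS.items()
    for intended, rise_ms in means.items()
    if label(rise_ms) != intended
]

for context, intended in misclassified:
    print(f"{intended} mean in {context} is labelled {label(MEAN_RISE_MS[context][intended])}")
```

Under this rule the affricate means for isolated words (49 ms)
and nonsense syllables (61 ms) both fall on the fricative side;
only the running speech mean (33 ms) is labelled as an
affricate.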

Howell and Rosen noted that their own measurements of 
rise times are notably longer than Gerstman's (1957) 
measurements for /tʃ/ and /ʃ/. Gerstman found affricates
with rise times as short as 5 ms. When Howell and Rosen
synthesized /tʃ/ and /ʃ/, they found natural sounding
affricates with 30 to 50 ms long rise times. Although they
have yet to determine the cause of the discrepancies, Howell 
and Rosen do concur with Gerstman that rise time 
distinguishes (voiceless) fricatives from affricates. 

Howell and Rosen's findings, and the findings of others 
comparing stops with affricates and affricates with
fricatives, indicate categorical perception of the manner
classes based on differences which vary along acoustic
continua. One of these continua is duration of onset rise
time. The present study investigated the role of the
duration of onset in the identification of stops and
fricatives.


C. Duration

Rise time is one factor in the fricative-affricate
distinction; another separate but related factor is
consonant duration. Van Heuven (1979) reanalysed Gerstman's
(1957) data, factoring out variances due to duration and to
rise time in order to verify Gerstman's analysis. Van
Heuven found that duration accounts for 75 percent of the
response variance in the data, and rise time and steady time
each accounted for only an additional seven percent of the
variance. This contradicts Gerstman's analysis which
attributes the fricative/affricate distinction mainly to
rise time, not to overall duration. Other studies in which
duration is critical merit mentioning to show the variety of
situations in which duration is important in distinguishing
speech segments. Liberman, Delattre, Gerstman, and Cooper
(1956) found that the duration of the consonant to vowel
formant transition was a sufficient cue for distinguishing
stops from semivowels. As the transition duration increased
the stimuli were increasingly perceived as semivowels.
Duration also seems to be a factor in the fricative-stop
distinction. Grimm (1966) noted that truncated fricatives
(fricatives with part of the frication removed) were
generally heard as stops. Carden, Levitt, Jusczyk, and
Walley (1980) also found that the "primary cue to the
stop-fricative manner contrast in CV syllables is the
duration of the aperiodic noise that precedes the formant
transitions".

A few studies have measured the duration of stop and
fricative consonants and have come up with somewhat varying 
measurements. Umeda (1977) found the mean duration of /v/,
word initially before a stressed vowel, was 78 ms, and
that /ð/ is usually shorter in the same position: 52 ms.
Umeda attributed this difference to the fact that the only
English words that begin with /ð/ are function words which
tend to be spoken relatively quickly. Malecot (1968)
measured the duration of the word initial /v/ before a
stressed vowel as 202 ms. (Malecot did not include /θ/ and
/ð/ in his investigations.) Abbs and Minifie (1969) gave
the following mean durations for /v/ and /ð/ in initial
position in CV syllables: /v/ - 153 ms, /ð/ - 123 ms. The
variation found in the duration of /v/ for these studies is 
remarkable: Umeda - 78 ms, Abbs and Minifie - 153 ms, and 
Malecot - 202 ms. The variation may be explained by the 
fact that word initial stressed consonants show a relatively 
large variance compared to word-medial and word-initial 
unstressed consonants (Umeda, 1977). 
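
To make the size of this variation concrete, the three reported
means for word initial /v/ can be compared directly. The short
Python sketch below uses only the values quoted above; the
variable names are illustrative, and averaging across studies
with different materials and speakers is done purely for
orientation.

```python
from statistics import mean

# Mean word initial /v/ durations (ms), as cited in the paragraph above.
V_DURATION_MS = {
    "Umeda (1977)": 78,
    "Abbs and Minifie (1969)": 153,
    "Malecot (1968)": 202,
}

values = list(V_DURATION_MS.values())
spread_ms = max(values) - min(values)   # difference between extremes
ratio = max(values) / min(values)       # longest mean relative to shortest
average_ms = mean(values)               # unweighted mean of the three studies

print(f"spread = {spread_ms} ms, ratio = {ratio:.2f}, mean = {average_ms:.1f} ms")
```

The longest reported mean is about 2.6 times the shortest, a
spread of 124 ms, which is larger than the shortest mean itself.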

Another possible reason for the variation in the 
measured duration of the fricatives is speaker variation. 
Variation in the articulation of fricatives has been noted
by Hughes and Halle (1956), Klatt and Cooper (1975), Malecot
(1968), and Nartey (1983). Hughes and Halle found great
discrepancies among the spectra of a fricative spoken by
different speakers in various contexts. Klatt noted that
the duration of the fricative /ʃ/ was variable across all
the speakers studied. Malecot found differences existing
among subjects in the force of articulation of consonants,
including fricative consonants. He also noted variability
in the amplitude and duration of the consonant. As well as
expecting variation in the duration of consonants produced,
we should expect variation in the spectra and intensity of
the consonants. 

Many studies measure the duration of pre-burst closure
for stops. In comparison to fricatives, it is the duration
of the noise portion of the stop, rather than the pre-burst
closure, which is of interest because it is the burst noise
that is comparable to the frication noise. Klatt (1975)
measured the VOT of stop consonants (from the release of the
plosive to the onset of vertical striations in the second
and higher formants of the vowel). He noted that for voiced
stops, burst duration equals VOT by definition. He made the
following measurements of voiced stops: the burst duration
for /b/ was, on average, 11 ms; for /d/, 17 ms. The standard
deviation for his measurements was approximately 5 ms for
both /b/ and /d/.

Klatt explains the short duration of /b/ as a result of
a rapid labial release. Further, the rapid release
generates a burst spectrum that is weak in intensity because
there is no resonating cavity in front of the lips. The
short duration and the weak intensity of the /b/ release
cause the burst to be judged as not very loud, since the
loudness of the burst is proportional to both its intensity
and its duration. This attribute of /b/ bursts may account
for identifications of truncated, or otherwise altered,
syllable initial consonants as /b/. The burst of /b/ is
very low in intensity; therefore, a consonant with no burst
or little burst might be identified as a /b/.

Although there is a large difference between the
durations of stop bursts and fricative frication, there may
be a durational continuum between the two. The experiments
conducted here altered noise durations of stops and
fricatives to test for such a relationship. The large
differences in the original durations of these manner
classes indicate that duration may be a major factor in
distinguishing stops and fricatives.


D. Intensity 

Not only does intensity play a role in distinguishing
fricatives from each other, but there is evidence that
intensity influences the perception of voicing and place of
articulation of speech sounds. It is well accepted that
intensity distinguishes the fricatives /s/ and /ʃ/ from /f/
and /θ/. McCasland (1979) found that /s/ and /ʃ/ are
identified as /f/ and /θ/ if they are spliced to replace /f/
or /θ/ in a CV syllable and attenuated in amplitude. When
/s/ and /ʃ/ reach the level of intensity of /f/ and /θ/,
they are identified as /f/ or /θ/ depending on the original
consonant.

Abbs and Minifie (1969) measured the intensity of the
fricatives /s/, /z/, /f/, /v/, /θ/, and /ð/ in CV and VC
syllables. They found that /s/ and /z/ were significantly
more intense than the other fricatives. This study supports 
McCasland's evidence that intensity is a cue used to 
distinguish fricatives. 

The evidence of the effect of intensity on perception
of voicing is not as consistent as that of its role in
distinguishing fricatives. Minifie (1973) claims voiced
fricatives are less intense than voiceless fricatives
because of the reduced intraoral pressure in the production
of voiced fricatives, even though Abbs and Minifie (1969)
found no consistent differences in intensity between voiced
and unvoiced fricatives. So, it is not clear whether or not
intensity is a reliable cue for the perception of voicing in
fricatives.

Repp (1979) found that amplitude of aspiration noise is 
a cue for the distinction between voiced and voiceless 
syllable-initial stop consonants in English. Repp 
identified a trading relationship between the amplitude of 
the aspiration noise and VOT, in the identification of 
synthetic /da/ and /ta/ syllables which varied along a VOT 
continuum. An increase in the aspiration amplitude resulted
in a shift of the voiced-voiceless boundary towards shorter
VOT. 

Ohde and Stevens (1983) studied the effect of the
relative amplitude of the release burst of a stop on its
place identification. They varied the amplitude of the
first 10 to 15 ms of voiced and voiceless synthetic stop 
stimuli which varied spectrally along a labial-dental place 
continuum. Ohde and Stevens found that "the relative 
amplitude of the burst significantly affected the perception 
of the place of articulation of both voiceless and voiced 
stops" (p. 706). An increase in amplitude biased a response 
toward the dental consonant over the labial one. This 
effect was noted especially in stimuli with spectra 
intermediate between labial and dental end-stimuli. 

Malecot (1968) measured the amplitude of English stops 
and fricatives in various positions within a nonsense 
disyllable. His study is one which compares stops and
fricatives, but it does not include /θ/ and /ð/. In initial
position, Malecot found that the voiced stops /b/, /d/, and /g/
have greater amplitudes than the voiced fricatives /v/ and
/z/.

Since only voiced consonants are being considered here, 
studies which compare intensity with respect to voicing are 
relevant only in showing that intensity affects 
identification. The same holds for the effect of intensity 
on fricative identification, since we are interested in the
comparison of stops and fricatives. From Malecot's evidence
we would expect increased intensity of an ambiguous
consonant stimulus to bias response towards stop, rather than
fricative. And Ohde and Stevens' study suggests higher
intensity will increase the proportion of dental responses.
It is to determine the validity of these conjectures that
intensity is included as a factor in the present
investigation.


E. Consonant Confusions 

Consonant confusion statistics are relevant to the 
current study for the following reasons: 1) They specify 
errors in consonant identifications to expect. 2) They 
indicate what proportion of confusions between two 
consonants is normal. And, 3) they give an estimation of
the distance between consonants in perceptual space. This
allows us to speculate on the salient distinctions between
consonants. In order to get an idea what identification
errors are common among the four consonants under
consideration, the results from four consonant confusion
studies have been tabulated. The studies included in the
table are Kent, Wiley, and Strennen (1979); Miller and
Nicely (1955); Tolhurst (1954); and Wang and Bilger (1973).
Reporting more than one study allows for the comparison of a
variety of results; this is important because there are some
discrepancies in the proportion of misperceived stimuli and
the usual choice of substituted consonant. Also, the
results of consonant confusion studies are not easily
replicated, suggesting that any claims must be considered in
light of other findings.

The table of consonant confusions studied, Table 2.1, 


shows the most frequent misperceptions are between /b/ and 
Table 2.1 Results of consonant confusion studies.

                          Study

             Kent     Miller      Tolhurst   Wang &
                      & Nicely               Bilger
Confusion    (1979)   (1955)      (1954)     (1973)
/b/-/d/      none     rare        rare       rare
/b/-/v/      some     many        many       many
/b/-/ð/      few      few         some       some
/v/-/d/      none     rare        few        rare
/v/-/ð/      some     many        many       many
/d/-/ð/      some     few         some       few

Note.
"rare" indicates approximately 1% occurrence
"few" indicates approximately 5% occurrence
"some" indicates approximately 10% occurrence
"many" indicates approximately 20% occurrence
/v/ and between /v/ and /ð/. Confusions between /d/ and /ð/
and between /b/ and /ð/ occur occasionally, but between /b/
and /d/ and between /v/ and /d/ they are rare. In general,
/d/ is well-perceived, even under poor acoustic conditions
(such as low signal to noise ratios and narrow frequency
ranges). On the other hand, /ð/ is frequently misperceived.
Kent et al. (1979) found that at 40 dB SL all consonants are
perceived correctly more than 80 percent of the time, except
/θ/ and /ð/.

The prevalence of /v/-/ð/ confusions is interesting
because it involves an error in place identification. The
phonemes /v/ and /ð/ are often confused, even under ideal
conditions. Miller and Nicely (1955) found that /v/-/ð/ and
/f/-/θ/ distinctions were among the most difficult for
listeners to make. Wang and Bilger (1973), who varied
signal to noise ratios and presentation levels, found that
/ð/ was identified as /v/ twice as often as it was correctly
identified. And Kent et al. (1979) noted that even at 60 dB
SL /v/ and /ð/ were identified correctly only 93 percent of
the time. The high rate of confusion between the labial and
dental fricatives suggests consonant-vowel formant
transitions are not always sufficient to establish place of
articulation.

More surprising than the /v/-/ð/ confusion is the
/b/-/ð/ one, since the latter differ in both manner and
place. Despite their differing articulatory features, these
phonemes apparently are somewhat similar acoustically. The
/b/-/ð/ errors do not occur very frequently, but they do
occur in all the studies reported. The labial/dental
misidentification is another demonstration of the relative
difficulty of discerning place of articulation, as apparent
in /v/-/ð/ errors. The manner confusion suggests similarity
between voiced stops and fricatives. Manner confusion also
appears in the /v/-/b/ substitutions, which are frequent,
and the /ð/-/d/ substitutions, which occur occasionally.
The relatively high occurrence of stop-fricative confusions
indicates strongly that these consonants are close together
in the auditory system's perceptual space. Under less than
optimum listening conditions stops and fricatives are
confused.

Carden et al. (1980) conducted an experiment designed
to explore not just consonant confusions but particularly
stop-fricative confusions. As their aim is similar to the
aims of the present study and since Carden et al. achieve
interesting results, their study will be discussed in some
detail.

The experiments of Carden et al. investigated the
relationship between manner identification and place
identification. Manner-ambiguous stimuli were created by
truncating the frication from natural fricatives. From
subject categorizations of /f/, /v/, /θ/, and /ð/, they
concluded that perception of place is dependent on the
manner perceived. In one experiment, they found that the
proportion of labial responses was greater when a stimulus
was identified as a stop than when the same stimulus was
identified as a fricative. For example, the truncated /θ/
was identified as /b/ 27 percent of the time and as /θ/ 20
percent of the time. It was identified as a labial
[illegible] percent of the time it was identified as a stop,
but as a dental 64 percent of the time it was identified as a
fricative. Carden et al. stated the phenomenon thus:
"...perceived manner affects the perception of the
formant transition cue" (p. 79). They suggest that
listeners compare incoming transitions to one of two
labial-dental boundaries, one boundary dividing the stops,
the other separating the fricatives. Two boundaries are
proposed because of the difference in place of articulation
between the labial stops and fricatives and between the
dental stops and fricatives. Since the stops and fricatives
are articulated at slightly different places, their formant
transitions would be different, and the optimal boundary
between the labial and dental stops would not coincide with
the optimal boundary between the labial and dental
fricatives. Carden et al. further suggest that the fricative
transitions, for /f/ and /θ/ at least, lie on the labial
side of the labial-dental stop boundary. This explains why
truncated /θ/ is perceived as labial when identified as a
stop.

One difficulty with this explanation is its inability
to account for the identifications of truncated /ð/, which
was mainly called dental, not labial, whether identified as
a stop or a fricative. Carden et al. suggest /ð/ is
to distinguish stops (with abrupt burst onsets) from
affricates (with slower onsets) and fricatives (with very
gradual onsets). Since natural stops are much shorter than
natural fricatives, short duration of noise should indicate a
stop while long duration of noise should indicate a fricative.
Finally, since stops are higher in intensity than
fricatives, higher consonant amplitude should influence
perceptions of consonants toward stops. To verify these
conjectures, duration, intensity, and slope of onset are
investigated in the experiments described in the next
chapter.
III. Experimental Method

Experiment I involved manipulating stop consonants to 
have them perceived as fricatives. Experiment II involved 
altering fricative consonants to be perceived as stops. The
consonants studied are /b/, /d/, /v/, and /ð/. A recording
was made of two male Canadian English speakers saying words 
starting with these consonants in the sentence frames, "I
like [illegible]" and "I block [illegible]." The frames produce
speaking rates in line with conversational speech. One 
speaker recorded the stop-initial words for Experiment I.
The other speaker recorded the fricative-initial words for 
Experiment II. These two speakers produced consonants which 
were typical in duration and shape of onset. The two 
speakers recorded were chosen for their stable, clear 
enunciation and intonation of the words involved. The 
speakers recorded each stop and fricative initial word in 
the sentence frames a number of times. From these 
recordings one token was chosen as the original syllable for 
the creation of the experimental stimuli. 

The words recorded were chosen so that the consonants
occurred before a high front vowel, /i/, (high F2); a low
vowel, /æ/, (mid F2); and a high back vowel, /o/, (low F2).
The vowel /o/ was chosen instead of /u/ because of wide
speaker variation in pronunciation of /u/ in this area of
North America, and to avoid the dialect which pronounces /u/
as /yu/. The attempt to study only real English words,
rather than nonsense syllables, also favoured the choice of
the vowel /o/ over /u/ because of the greater variety of
monosyllabic words with the vowel /o/.

The words chosen for this study are partners in minimal
pairs - one word in the minimal pair begins with a stop, and
the other word begins with a fricative. These words are
"bee", "vee" (as in the letter V); "bat", "vat"; "boat",
"vote"; "dee" (as in the letter D), "thee"; "Dan", "than";
"dough", and "though." Figures 3.1, 3.2, and 3.3 show
spectrograms of these words. The spectrograms are grouped
by vowel to facilitate comparison of the four consonants
before the same vowel. The comparison shows the formant
transitions from the English labial stop and fricative and
dental stop and fricative into the vowels /i/, /æ/, and /o/
are similar in frequency and direction of change, indicating
that CV formant transitions do not distinguish homorganic
stops and fricatives. The main difference between the stops
and fricatives appears to be the burst onset of the stops
and the periodic noise of the fricatives.

It is important to use real words when the consonant
/ð/ is being studied because of the limited occurrence of
this consonant in the language. It is also especially
difficult to find minimal pairs involving the consonant /ð/.
In some instances, such as "Dan" and "than", the available
real words are suspect in their status as words. "Than" is
problematic, as are all English words starting with /ð/,
because it is a function word, not a content word. Function
words are pronounced quickly, usually with reduced vowels.
Figure 3.1

Figure 3.2. Spectrograms of /baet/, /daen/, /vaet/, and /dhaen/.
"Dan" raises difficulties because the vowel pronunciation
varies depending on the speaker. In some instances in 
running speech, and consistently for some speakers, the 
vowels in "Dan" and "than" are not the same. Articulating 
the stimulus words in sentence frames helped to alleviate 
these problems by making all the words have the same 
position and function in a sentence. A word at the end of 
either sentence frame could be regarded as a noun, given an 
appropriate semantic context. While placing the stimulus 
words in a sentence frame may gloss over differences in 
linguistic roles, it is desirable in a perceptual study 
because it allows us to concentrate on the acoustic 
differences between the speech sounds. 

The recorded sentences were band pass filtered from 68
Hz to 6800 Hz to eliminate low-frequency hum and prevent
aliasing. Next the sentences passed through an
analog-digital converter where they were sampled at a rate
of 16 kHz. The particular word to be modified was extracted
from the carrier phrase and stored in a file in a DEC PDP-12
minicomputer. Up to this point the procedure for developing
the stimuli for Experiments I and II was identical.
A. Experiment I 
Stimuli 


The stimuli for Experiment I varied in duration,
abruptness of onset, and amplitude of the word-initial stop.
In brief, duration was altered by repeating the word-initial
consonant. Abruptness of onset was modified by windowing
the original syllable to remove the burst of the stop and by
windowing to create a very gradual onset. Amplitude was
varied by multiplying the consonant portion of the syllable
by two, effectively doubling the amplitude of the consonant.
In constructing the experimental stimuli care was taken to
create tokens with oscillograms which resembled natural
speech and which sounded like natural speech.

The durations of the word-initial stops to be studied 
in Experiment I were found to vary greatly. The duration of 
the word-initial consonant burst was approximately 10 ms for 
the words "bat", "Dan", "dee", and "dough." The 
corresponding duration for "bee" and "boat" was, however, 
only 6 ms and 4 ms, respectively. The noise of these two 
consonants was extended to enable all the stimuli to be 
comparable and modified similarly. Extending the consonant 
noise was also necessary for the changes in duration; 
repetition of short /b/'s created signals which did not 
sound like speech. To create a 10 ms long /b/ from the 6 ms 
one of /bi/, the last 4 ms of the 6 ms burst was attached at 
the end of the 6 ms. To create a 10 ms /b/ from the 4 ms 
one of /bot/, the last 3 ms of the original was repeated and
attached to the original. These extended-consonant
syllables were not used as "original" words. The 6 and 4 ms
burst consonants were present in the experiment as the
original consonants. But in situations in which the
consonants were modified in any way, the extended-burst
consonants were used.
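
The burst extension described above can be sketched as follows. This is only an illustration, not the procedure actually run on the PDP-12: the 16 kHz sampling rate and the function names `ms_to_samples` and `extend_burst` are assumptions, and the burst is treated as a plain list of samples. For the 4 ms burst, the sketch assumes the 3 ms tail is appended twice, which is one reading of how a 10 ms consonant was reached.

```python
def ms_to_samples(ms, fs=16000):
    """Convert a duration in milliseconds to a whole number of samples.

    fs is an assumed sampling rate in Hz.
    """
    return int(round(ms * fs / 1000))


def extend_burst(burst, tail_ms, target_ms, fs=16000):
    """Lengthen a burst by re-appending its final tail_ms of samples
    until the result is target_ms long (hypothetical helper)."""
    tail = burst[-ms_to_samples(tail_ms, fs):]
    target = ms_to_samples(target_ms, fs)
    out = list(burst)
    while len(out) < target:
        # Append the tail, trimmed so the result never exceeds the target.
        out.extend(tail[:target - len(out)])
    return out


# A 6 ms burst plus its last 4 ms gives a 10 ms consonant.
burst_6ms = [float(i) for i in range(ms_to_samples(6))]
extended = extend_burst(burst_6ms, tail_ms=4, target_ms=10)
```

The original samples are left untouched at the start of the result, so the burst onset itself is unchanged by the extension.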

One modification of the stop consonants for this
experiment was changes in abruptness of onset. This involved
two separate alterations - one removed the burst of the stop,
the other windowed the stop by a ramp to create a gradual
onset. Removing the burst was done by multiplying the
original signal by a 3 ms long cosine squared window. By
doing this, "burstless" consonants were created. The
modification used to produce stimuli with "gradual" onsets
was one of multiplying the original stimuli by a linear ramp
window rising from zero to one in 10 ms. This windowing had
the effect of removing the burst of the consonant and
decreasing the amplitude of the signal immediately after the
burst. With these two onset-altering modifications, two
distinct sets of stimuli were produced from the stops:
syllables with initial burstless consonants, and syllables
with gradual onset consonants.
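
A minimal sketch of the two onset windows follows. It assumes a 16 kHz sampling rate and assumes that the cosine squared window rises from zero to one over its 3 ms and is unity afterwards; the exact window shape used on the minicomputer is not specified beyond its name and length. The function names are hypothetical.

```python
import math


def cos2_onset_window(n_total, rise_ms=3.0, fs=16000):
    """Window that rises from 0 to 1 over rise_ms with a squared-cosine
    shape and stays at 1 for the rest of the n_total samples."""
    n_rise = int(round(rise_ms * fs / 1000))
    window = []
    for n in range(n_total):
        if n < n_rise:
            window.append(math.cos(math.pi / 2 * (1 - n / n_rise)) ** 2)
        else:
            window.append(1.0)
    return window


def linear_ramp_window(n_total, rise_ms=10.0, fs=16000):
    """Window that rises linearly from 0 to 1 over rise_ms, then stays at 1."""
    n_rise = int(round(rise_ms * fs / 1000))
    return [min(n / n_rise, 1.0) for n in range(n_total)]


def apply_window(signal, window):
    """Multiply a signal by a window, sample by sample."""
    return [s * w for s, w in zip(signal, window)]
```

Because both windows start at zero, either one suppresses the first samples of the stop, where the burst lies; the 10 ms ramp additionally attenuates the signal that follows the burst.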

Figure 3.4 shows all the windows applied to the 
Experiment I stimuli. The cosine-squared window used to 
remove stop bursts appears first in the figure. Next is the 
10 ms long linear ramp used to create gradual onset
consonants. Below that appear linear ramps 20, 30, 40, 50, and
60 ms long. The latter are applied to the increased-duration
stimuli of Experiment I as described below.
Oscillograms of the stimuli discussed thus far can be found
Figure 3.4. Oscillograms of windows which modify Experiment I words.
unmodified words used in this experiment. Figure 3.7
contains oscillograms of the first 20 ms of the words,
showing the original unaltered stops in detail. Figure 3.8
shows the six stops after windowing by the 3 ms long window
that removes the burst of the consonants. Oscillograms of
the consonants after being windowed by the 10 ms long ramp
are displayed in Figure 3.9.

The variation in duration of the word-initial consonant
was created by repeating the original consonant noise from
two to six times. If the abrupt onset were left on the
consonant that was repeated, the result was not recognizable
as a speech sound because of the periodicity created by the
repetition of the burst. To ameliorate this condition, the
repeated sections had their bursts removed by application of
a 3 ms cosine squared window. The duration repetition was
done by concatenating the burstless consonant to itself,
followed by the vowel. The concatenation generated
consonants two, three, four, five, and six times the
original consonant length. Because of a buzz still heard
due to the periodic fluctuation in the noise amplitude of
the consonant, these tokens were further altered by
multiplying them by a linear rise ramp of approximately the
same duration as the consonant. A 20 ms ramp was applied to
the stimuli created by doubling the consonant length; a
30 ms ramp was applied to the stimuli that had consonants
three times the duration of the originals; and so on. The
ramped onset stimuli were more natural in structure, as well


Figure 3.5. Oscillograms of unmodified words beginning with labial stops.

Figure 3.6. Oscillograms of unmodified words beginning with dental stops.

Figure 3.8. Oscillograms of burstless stop consonants.

Figure 3.9. Oscillograms of gradual onset stop consonants.




as sound, than the unramped repeated consonants.
Structurally the stimuli resemble fricatives, which have
gradual increases in the noise amplitude from consonant
onset to vowel onset. The concatenations resulted in
consonants with duration ranging from approximately 10 ms long
to approximately 60 ms long in 10 ms steps. To illustrate the
effect of repeating the consonant to increase the duration
of the consonant and of applying the ramps to these
consonants, Figure 3.10 contains the /di/ stimuli of all six
durations.
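
The lengthening procedure can be sketched as below. The sketch assumes that the linear ramp spans exactly the lengthened consonant (the text says only "approximately the same duration") and that signals are lists of samples; `lengthen_consonant` is a hypothetical name, not a routine from the original software.

```python
def lengthen_consonant(burstless, vowel, repeats):
    """Concatenate `repeats` copies of a burstless consonant, apply a
    linear rise ramp across the whole consonant, then attach the vowel."""
    consonant = list(burstless) * repeats
    n = len(consonant)
    # Ramp rises from 0 at consonant onset towards 1 at vowel onset.
    ramped = [sample * (i / n) for i, sample in enumerate(consonant)]
    return ramped + list(vowel)


# Three copies of a 10 ms consonant (160 samples at an assumed 16 kHz)
# give a 30 ms consonant under a 30 ms ramp.
token = lengthen_consonant([1.0] * 160, [1.0] * 100, repeats=3)
```

Ramping the concatenated consonant as a whole, rather than each copy separately, is what smooths the periodic amplitude fluctuation that the repetition introduces.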

The three modifications discussed so far, removal of
burst, application of ramp window, and change in duration of
consonant, resulted in 48 stimuli: 6 originals (2 consonants
with 3 vowels), 6 without a burst, 6 with gradual onsets
(ramped), and 30 with various consonant durations. Each of
these tokens was used as the base for the amplitude factor.
To vary the amplitude, the consonant section of a stimulus
was doubled in amplitude by multiplying each point by two.
This resulted in 48 stimuli with increased-amplitude
consonants. Experiment I involved 96 stimuli in total. These
tokens embody variations in presence or absence of stop
burst, gradual or abrupt onset, duration of initial
consonant, and amplitude of the initial consonant.
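
The stimulus counts above can be checked by enumerating the design. The condition labels below are illustrative names chosen for this sketch, not labels used in the experiment.

```python
from itertools import product

consonants = ["b", "d"]
vowels = ["i", "ae", "o"]
# One original, one burstless, one gradual-onset (10 ms ramp) version,
# plus five lengthened versions (consonant repeated 2 to 6 times).
onset_conditions = ["original", "burstless", "gradual"] + [
    f"duration_x{k}" for k in range(2, 7)
]
amplitude_factors = [1, 2]  # normal and doubled consonant amplitude

stimuli = list(product(consonants, vowels, onset_conditions, amplitude_factors))
base_tokens = [s for s in stimuli if s[3] == 1]
lengthened = [s for s in base_tokens if s[2].startswith("duration_")]
```

Here `base_tokens` holds the 48 normal-amplitude tokens, 30 of which are lengthened, and `stimuli` holds all 96 items.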

Figure 3.11 provides more sample oscillograms to 
illustrate the construction of Experiment I stimuli. Again, 
these traces are based on the syllable "dee." Included in
this figure are oscillograms of the first 40 ms of the
Figure 3.10. ... vary in duration.

Figure 3.11. ... stimuli.
original /di/ syllable, this syllable with the burst
removed, the first increase in duration due to repetition of
the burstless consonant, the increased-duration syllable
ramped, and finally, the ramped trace modified by an
increase in the amplitude of the consonant.


Presentation 

The methods of presentation used in the two experiments 
were different. For Experiment I, the stimuli were 
randomized and three replications of different 
randomizations were recorded onto tape. The stimulus words 
were placed at the end of the carrier phrase "Please say the 
word... ." Ten practice sentences preceded the 288 (96 
stimuli x 3 replications) items and two filler sentences
followed to round the number of items to 300. A 3 ms pause 
was inserted after every tenth sentence. The entire tape 
was 20 minutes long. It was played back on a TEAC A-7030 
GSL stereo tape deck over Telephonics TDH-49 headphones in a 
sound treated room. A sine wave which covered the 
minicomputer's range of amplitude output was used to 
calibrate the recording level so that the stimuli would not 
be clipped. The same sine wave was used to set the
playback level on the TEAC tape deck. 
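
The composition of the 300-item tape can be sketched as follows. The source does not say which items served as practice and filler sentences, so drawing them at random from the stimulus set is purely an assumption of this sketch, as are the function name `build_tape` and the `seed` parameter.

```python
import random


def build_tape(stimuli, replications=3, n_practice=10, n_filler=2, seed=None):
    """Return a presentation order: practice items, then `replications`
    independently shuffled passes through the stimuli, then fillers."""
    rng = random.Random(seed)
    # Assumption: practice and filler items are sampled from the stimuli.
    practice = [rng.choice(stimuli) for _ in range(n_practice)]
    test_items = []
    for _ in range(replications):
        block = list(stimuli)
        rng.shuffle(block)  # a different randomization for each replication
        test_items.extend(block)
    fillers = [rng.choice(stimuli) for _ in range(n_filler)]
    return practice + test_items + fillers


tape = build_tape(list(range(96)), seed=1)
```

With 96 stimuli the list has 10 + 288 + 2 = 300 entries, and only the middle 288 would be scored.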

Subjects were given an answer sheet which included 
instructions, a confidence rating scale, and numbered 
choices of two words corresponding to the stimuli on the
tape. (A copy of this answer sheet is included in Appendix
B.) Subjects were asked to choose which word they heard at
the end of the sentence, "Please say the word... ." They
were given a choice of a word starting with either a stop or
a fricative. The vowel and optional consonant at the end of
the word corresponded to the word heard on the tape at that 
item number. A word beginning with a stop and a word 
beginning with a fricative appeared in alternate columns for 
each item. The alternation was designed to remove any bias 
towards words in either column. Subjects were also asked to 
indicate their confidence in their response on a scale of 
one to three, three representing high confidence and one 
representing low confidence. They indicated these 
confidence ratings by circling the appropriate number on the 
response sheet. 

This design prevented subjects from confusing the place
of articulation of the word-initial consonant. For
instance, if a subject was presented with the word "thee",
he or she had as a response choice "dee" or "thee." The
subject could not respond "vee", even if the item was
perceived as starting with a /v/. (Two of the subjects did,
however, indicate when they heard a /ð/ when given only the
choice of /b/ or /v/. These responses will be discussed
briefly in Chapters IV and V.)

Every subject listened to the tape at least four times.
Subjects 1 and 5 listened six times. This gave twelve
replications of each stimulus syllable for Subjects 2, 3,
and 4, and eighteen replications for Subjects 1 and 5.




Subjects 

A small number of subjects were used, but each subject 
heard each stimulus many times. The subjects for both
experiments were English speakers from various regions of
Canada and the United States who are currently residing in 
Edmonton, Alberta. The dialect of English the subject spoke 
was not expected to affect his or her perception of English 
consonants. The subjects reported having no known hearing 
impairments. 

Five people participated in Experiment I, four males 
and one female. All of the subjects were university 
students or staff, and all had some knowledge of phonetics. 
Subjects 1 and 3 were familiar with the design of the
stimuli.


B. Experiment II 


Stimuli 

Experiment II was designed to alter word-initial
fricatives to determine which factors would cause a fricative
to be perceived as a stop. Modifications were made to the
duration and amplitude of the fricative. In this experiment
the amplitude factor had three levels. Since the original
fricative waveforms were of such low amplitude, it was
possible to multiply each consonant by two twice. The first
level was normal amplitude; the second level was doubled
amplitude; and the third level was quadrupled amplitude.
Each amplitude increase added 6 dB to the intensity level of
the frication.
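
The 6 dB figure follows from the standard relation between an amplitude ratio and decibels, 20 log10(ratio). A short check, with `gain_db` as a hypothetical helper name:

```python
import math


def gain_db(amplitude_factor):
    """Level change in dB produced by scaling amplitude by a factor."""
    return 20 * math.log10(amplitude_factor)


doubled = gain_db(2)      # about 6.02 dB
quadrupled = gain_db(4)   # about 12.04 dB, i.e. two 6 dB steps
```

The three amplitude levels are therefore spaced roughly 0, 6, and 12 dB above the original frication.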

The second factor in Experiment II was duration. The
duration of the fricative was decreased to resemble a stop
by gating out middle sections of the fricative frication.
Sections were cut out leaving three, two, and one period of
the original voiced fricative. The periods that were kept
for the three-period stimuli were the first two and the last
period (before the vowel) of the original fricative. The
two-period long stimuli were made of the first and last
period; the one-period ones consisted of only the last
period of frication before the vowel transition. The
original fricatives varied in duration, and because the
fundamental frequency of the recorded words varies slightly,
the durations of the shortened fricatives vary slightly
around 30, 20, and 10 ms.
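
The gating scheme can be sketched as below, assuming the fricative has already been segmented into pitch periods, each given as a list of samples. How the period boundaries were located is not described in the text, and `gate_periods` is a hypothetical name.

```python
def gate_periods(periods, keep):
    """Shorten a voiced fricative to `keep` pitch periods.

    periods: list of per-period sample lists, in temporal order.
    keep:    3 -> first two periods and the last one
             2 -> first and last period
             1 -> last period only (the one before the vowel)
    """
    if keep == 3:
        kept = periods[:2] + periods[-1:]
    elif keep == 2:
        kept = periods[:1] + periods[-1:]
    elif keep == 1:
        kept = periods[-1:]
    else:
        raise ValueError("keep must be 1, 2, or 3")
    # Flatten the retained periods back into one sample sequence.
    return [sample for period in kept for sample in period]


# Toy fricative with five one-sample "periods" labelled 1 to 5.
toy = [[1], [2], [3], [4], [5]]
three_period = gate_periods(toy, 3)
```

Every shortened version retains the final period, so the transition into the vowel is preserved in all three durations.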

By creating three new durations of consonant from each
fricative, and by creating two new consonant amplitude
sizes, 72 items were prepared for the second experiment.
These items consisted of two consonants (/v/ and /ð/) with
three vowels (/i/, /æ/, and /o/), 4 durations (original,
three periods of frication, two periods, and one period),
and three amplitudes (original, double, and quadruple
consonant amplitude). Figures 3.12 and 3.13 show oscillograms
of the entire syllables used in Experiment II. Figures 3.14
and 3.15 contain oscillograms of the consonant portion of
the original Experiment II syllables. Figure 3.16 is a
Figure 3.12. Oscillograms of Experiment II words starting with /v/.

Figure 3.13. Oscillograms of Experiment II words starting with /dh/.

Figure 3.16. Oscillograms of a sample of the stimuli that vary
in duration in Experiment II.
sample of the durational stimuli used in Experiment II. The
last figure, 3.16, shows the entire original consonant from
the syllable "thee", the stimulus constructed from three
periods of this syllable, the two-period stimulus, and the
one-period stimulus from the syllable "thee."


Presentation 

The method of presentation of stimuli for Experiment II
was quite different from that for Experiment I. After the
presentation is described, the motivation for the 
differences will be discussed. The 72 Experiment II stimuli 
were randomized and presented to subjects in sets of three 
replications (216 items). These items were presented as 
isolated words over a Heco Sound-Master 15 loudspeaker at a 
comfortable listening level in a sound-treated room. The 
presentation was computer controlled, rather than recorded. 
The computer played the stimulus repeatedly until it 
received a response to that item. Then the computer played 
the next item on the list of randomized stimuli. 

For each item, subjects were given the choice of
responding /b/, /v/, /d/, or /ð/. Subjects responded on a
four column by three row touchpad. Each column represented
one of the four word-initial consonant choices. The order
of the consonants on the touchpad was rearranged every
trial. The rows on the touchpad represented the confidence
rating the subject associated with his or her response. The
top row corresponded to a confidence rating of 3 in
Experiment I, i.e., high confidence. The middle row indicated
moderate confidence, and the last row meant low confidence. A
subject responded by deciding which consonant the test word
started with and how confident he or she was of the answer,
and he or she pressed the appropriate key. Usually,
subjects listened to a word two or three times before
responding. The PDP-12 minicomputer stored each response in
a separate file for each trial for each subject. Each
subject performed the three-replication trial six times,
giving 18 responses to each stimulus item for each subject.
A trial took approximately 15 to 20 minutes to perform,
depending on the subject's familiarity with the procedure
and the stimuli.
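
How a key press maps onto a response can be sketched as follows. The row and column indexing, the function name `decode_key`, and the use of "dh" for the dental fricative are assumptions made for illustration; the original PDP-12 program is not described at this level of detail.

```python
def decode_key(row, col, column_order):
    """Translate a touchpad key into (consonant, confidence).

    row:          0 (top), 1 (middle), or 2 (bottom)
    col:          0 to 3, left to right
    column_order: consonant assigned to each column for the current trial
    """
    if row not in (0, 1, 2) or col not in (0, 1, 2, 3):
        raise ValueError("key outside the 4 x 3 touchpad")
    consonant = column_order[col]
    confidence = 3 - row  # top row = 3 (high), bottom row = 1 (low)
    return consonant, confidence


# Column order is rearranged every trial; this is one possible layout.
layout = ["v", "b", "dh", "d"]
response = decode_key(0, 2, layout)   # ("dh", 3)

responses_per_stimulus = 3 * 6        # 3 replications x 6 trials = 18
```

Keeping the column order as an explicit argument makes the per-trial rearrangement of consonants visible in the decoding step.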

The presentation of Experiment II stimuli was different
from the presentation of Experiment I for a number of
reasons. Mainly, Experiment II differed to allow subjects
more choices in their responses, to allow them to respond
with either a labial or dental place of articulation
regardless of the place of articulation of the original
consonant. Comments from subjects in Experiment I indicated
perception errors in place of articulation. The change to
responding with any of the four voiced labial and dental
stops and fricatives would allow subjects to express these
perceptions. In addition, indications of a dependency of
place of articulation on manner of articulation, as proposed
by Carden et al. (1980), could be examined from the
Experiment II data.
The change from headphones to a speaker was made
because some Experiment I subjects complained that the 
headphones were uncomfortable and because headphones are not 
usually involved in speech perception. The headphones had 
to fit tightly to reduce interference from outside noise, 
especially since subjects had only one chance to perceive 
each stimulus item. Usually sounds originate farther from 
the ear than headphone signals do. Perceiving audio signals 
through headphones may require special adjustment on the 
part of the listener (Gibson, 1966). For greater subject 
comfort and to increase the naturalness of the speech 
perception task, the second experiment stimuli were 
presented over a speaker. 

The touchpad allowed for four choices of consonant 
responses, and the three confidence levels as in 
Experiment I, and automated data recording because of the 
connection to the minicomputer. As well, using a computer 
and a touchpad allowed subjects to control the pace of the 
presentation. Subjects were able to listen more critically 
to the stimuli. Response errors due to accidentally 
touching the wrong key on the pad occurred, but subjects 
could correct errors by touching another key before the 
first was recorded. When the computer received responses 
from two keys it ignored both responses and replayed the 
Subjects

Five native English speakers participated in
Experiment II, three females and two males. The subjects
were university students and faculty members. The subjects
had no known hearing losses. Subject 5 had no knowledge of
phonetics. Subjects 1 and 2 also participated in
Experiment I, as Subjects 1 and 3.
IV. Results


A. Experiment I 


Measurements 


Duration 

Measurements were made of the duration and intensity of 
the Experiment I stimuli. These measurements were made on 
the six original words, the original consonants, and all the 
altered consonant stimuli. (The alterations included
variation of duration of onset, of duration, and of the
amplitude of the consonant.)

Table 4.1 gives the durations of the original 
syllables, the original consonants, and the altered-duration 
consonants. The stimuli altered for intensity (amplitude) 
and duration of onset are not included in this table since 
these modifications did not change the stimulus duration. 
The first column in the table shows the duration of the 
original syllable; the second shows the duration of the 
vowel; the third gives the duration of the original
consonant. The last column in Table 4.1 gives the duration of
the stop consonant which was repeated to create the
duration-varying stimuli. Since the five increased-duration
stimuli were created by concatenating the ramped consonant,
their durations are not included in the table; their
durations are simply multiples of the consonant duration in
the last column. The stop consonant measurements show that
the burst of /b/ is, on average, a little more than half the
duration of the burst of /d/. The mean measurement of /b/ is
6.7 ms; the mean of /d/ is 12.6 ms.

Table 4.1 Durations of Experiment I stimuli.

                        Duration in ms

Syllable   Syllable      Vowel         Original C    Gradual Onset C
bee        330           [illegible]   6             10
bat        365           [illegible]   10            10
boat       300           296           4             10
dee        [illegible]   275           [illegible]   12
Dan        376           363           [illegible]   [illegible]
dough      334           [illegible]   12            [illegible]


Intensity 

The PDP-12 minicomputer was used to take measurements 
of the root mean square area of the stimulus syllables and 
the consonant portions of the stimuli. These measurements 
were converted to decibels to represent the intensities of 
the signals. The decibel measurements are presented in 
Table 4.2. The measurements have been averaged over the 
three vowels. As well as showing the intensities of the 
consonants, the table gives the mean syllable intensity as 
an indication of the amplitude of the consonant relative to 
the vowel. The mean syllable intensity for both the 
original labial and dental stop words in Experiment I is 
44 dB. The dental consonants have greater intensities than 
the labials. The dental stop /d/, with a mean intensity of 
20 dB, is 4 dB higher in intensity than the average /b/ at
16 dB. The intensities of the stops drop 2 to 3 decibels
when the burst is removed. And the intensity decreases 
again, by approximately 4 decibels, when the stop has a 
gradual 10 ms onset. The intensities increase as the
durations of the consonant portions of the stimuli increase from

Table 4.2 Intensities of Experiment I stimuli (dB).

                                Stimulus

Consonant   Original   Burstless   10   20   30   40   50   60   Syllable
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Recognition Curves 

The recognition curves of the consonants in 
Experiment I are generally consistent. In only a few 
instances did subjects reverse the trend of their responses 
as the stimuli varied along the continua. Mean results for 
all the subjects are discussed below; recognition curves for 
individual subjects are given in Appendix C. The individual
subject curves are supplied to compare results across
subjects.

Experiment I data are recorded as a proportion of stop
(/b/ or /d/) responses at the various intensity, onset
slope, and duration levels. Figures 4.1 to 4.6 show the
mean of all subject responses to the stimuli with various
onsets. The three onsets are labelled "burst" (the
original, unaltered consonant), "no burst" (the consonant
was windowed by a short cosine squared wave to remove the
abrupt onset), and "gradual" (the consonant was windowed by
a 10 ms linear ramp). Each figure has two lines, one for
each consonant intensity level, normal and double amplitude.
Figures 4.1 to 4.3 depict the results for the consonant /b/;
Figures 4.4 to 4.6 show the /d/ onsets. These
illustrations, for the most part flat horizontal lines, show
that onset slope alone does not greatly affect a consonant's
recognition as a stop. Stops are still identified as stops
when their bursts are removed. Even stops with linear 10 ms
rises are perceived as stops more than 80 percent of the
time. All subjects categorized the consonant as a stop 100

Figure 4.1 Experiment I /bi/ onset recognition curve, mean of
all subjects. [Ordinate: proportion of stop responses.
Abscissa: onset (burst, no burst, gradual). Lines: normal
consonant amplitude; consonant amplitude x 2.]
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Figure 4.2 Experiment I /bae/ onset recognition curve, mean of 
all subjects. 
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Figure 4.3 Experiment I /bo/ onset recognition curve, mean of 
all subjects. 
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Figure 4.4 Experiment I /di/ onset recognition curve, mean of 
all subjects. 
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Figure 4.5 Experiment I /dae/ onset recognition curve, mean of 
all subjects.

Figure 4.6 Experiment I /do/ onset recognition curve, mean of
all subjects.

percent of the time under all three onset conditions when
the vowel following was /i/. Only Subject 2 heard stops
less than 80 percent of the time under these gradual onset
conditions.

While onset slope had only a small effect on the
perception of stops, duration had a large effect. Figures
4.7 to 4.12 show the mean responses over all subjects to
stimuli that varied in duration in Experiment I. The first
point on the abscissa, labeled "10 ms", is the same as the
last point, labeled "gradual", on the abscissa of the onset
figures, Figures 4.1 to 4.6. The response curves in these
figures show a shift from stop to fricative response
occurring between approximately 18 and 30 ms in consonant
duration. The slopes of the lines are not steep, implying
that the changes in perception from stop to fricative do not
happen at a specific duration. Rather, the stimuli with
consonant durations near the manner boundary appear somewhat
ambiguous in their manner status. Since subjects are shown
to be significantly different in the statistical analysis
(discussed in the next section), the individual subjects'
graphs in Appendix C should be scrutinized and compared.
Scrutiny shows that individuals did have differing crossover
points, some subjects giving fricative responses at earlier
durations, some later.

Figure 4.7 Experiment I /bi/ duration recognition curve, mean of
all subjects. [Ordinate: proportion of stop responses.
Abscissa: duration (ms), 10 to 60. Lines: normal consonant
amplitude; consonant amplitude x 2.]
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Figure 4.8 Experiment I /bae/ duration recognition curve, mean of 
all subjects. 
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Figure 4.9 Experiment I /bo/ duration recognition curve, mean of 
all subjects.

Figure 4.10 Experiment I /di/ duration recognition curve, mean of
all subjects. [Ordinate: proportion of stop responses.
Abscissa: duration (ms), 10 to 60. Lines: normal consonant
amplitude; consonant amplitude x 2.]

Figure 4.11 Experiment I /dae/ duration recognition curve, mean of
all subjects.
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Figure 4.12 Experiment I /do/ duration recognition curve, mean of 
all subjects. 
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Confidence Ratings 

As well as identifying experimental stimuli, subjects 
indicated their confidence in the identification using a 
rating of 1 (low confidence), 2 (moderate confidence), or 3 
(high confidence). These confidence ratings provide 
evidence of the location of the subjects' stop/fricative 
boundaries, in addition to the evidence from the recognition 
curves. Table 4.3 lists the mean confidence ratings each 
Experiment I stimulus received. The mean confidence ratings
for the original consonants are high, 2.9 out of 3. The
ratings decrease as the stimulus duration increases,
reaching the lowest level at 20 ms consonant stimulus
duration. The mean confidence rating for the 20 ms
Experiment I stimuli is 1.1. The 30 ms stimuli received the
next lowest rating, 1.6. The ratings rise again as the
duration of the stimulus consonant further increases toward
the 60 ms duration stimuli.

The decrease in the confidence ratings indicates the 
location of the subjects' stop/fricative boundaries because 
the ratings show which stimuli subjects felt were
ambiguous. Uncertainty about the manner of articulation of
the consonant is the primary reason for assigning low
confidence ratings. The ratings are lowest for consonant
durations of 20 and 30 ms, which is evidence that the boundary
between stop and fricative durations is approximately 20 to

Table 4.3 Mean confidence ratings assigned to
Experiment I stimuli.

Stimulus              Confidence Rating
                      (out of 3)
Original Consonant    2.9
Burstless             2.3
10 ms                 [illegible]
20 ms                 1.1
30 ms                 1.6
40 ms                 [illegible]
50 ms                 [illegible]
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The confidence ratings also reflect how natural the 
stimuli sounded. Some subjects told the experimenter that 
they gave low ratings to unnatural sounding stimuli. 
Naturalness and confidence are closely linked since one 
cannot be entirely sure of the status of a consonant which 
does not sound like speech or which sounds like speech but 
has never been heard before. The confidence ratings give an 
indication of the quality of the stimuli as speech sounds, 
if they are interpreted as reflecting naturalness. The 
overall mean rating for Experiment I stimuli is 2.2. 
Subjects seemed reasonably confident of their responses and
satisfied that the stimuli were speech.


Analysis of Variance 


Design 

An analysis of variance (ANOVA) was performed using the 
50 percent stop/fricative crossover points of the duration 
recognition curves of the subjects as input data. The 
crossover is the point along the stimulus consonant duration 
continuum at which the subject's responses reach 50 percent 
stop and fricative. At durations greater than this 
boundary, the stimuli were heard as fricatives; at shorter 
durations they were perceived as stops. The trials each 
subject performed were totalled into two groups - two 
replications, so each group in the repeated measures design 


had a large number of responses. 
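The 50 percent crossover described above can be illustrated with a short sketch. It assumes straight-line interpolation between adjacent points on a recognition curve; the thesis does not state the exact procedure used for Experiment I, and the function name and the sample curve are hypothetical, not thesis data.

```python
def crossover_duration(durations_ms, stop_proportions, level=0.5):
    """Return the duration at which the stop proportion falls to `level`.

    Assumption: linear interpolation on the first segment of the curve
    that spans `level`. Returns None if the curve never crosses it.
    """
    points = list(zip(durations_ms, stop_proportions))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 >= level >= p1 and p0 != p1:
            return d0 + (p0 - level) * (d1 - d0) / (p0 - p1)
    return None


# Hypothetical duration recognition curve (illustrative values only).
durations = [10, 20, 30, 40, 50, 60]
stops = [1.0, 0.75, 0.25, 0.0, 0.0, 0.0]
print(crossover_duration(durations, stops))  # 25.0
```

With these illustrative values the boundary falls midway between the 20 and 30 ms stimuli. A curve that never drops to 50 percent yields no crossover, which is the situation dealt with separately for Experiment II.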



The ANOVA was compiled using BMDP 8V in a four factor 
repeated measures design. The four factors are subjects 
(5), intensity (2 levels: normal and consonant amplitude
times two), consonants (2: labial and dental), and vowels
(3: /i/, /e/, /o/). Subjects and replications were treated
as random effects; intensity, consonants, and vowels were
regarded as fixed effects. The repeated measures were the
two replication groups of subject trials.

The analysis of variance sums of squares, F ratios, and 
probabilities of the factors in Experiment I are given in 
Table 4.4. A few factors and interactions were significant: 
subjects (p < .0001), vowels (p < .01), the subject by
consonant interaction (p < .0001), the subject by vowel
interaction (p < .0001), the consonant by vowel interaction
(p < .05), and the subject by consonant by vowel interaction
(p < .0001). The highest level interaction, subject by
consonant by vowel, is diagrammed in Figure 4.13.


Proportion of Variance of Each Factor

The statistic ω² (omega squared) was calculated to
estimate the proportion of total variance associated with
each factor (Hays, 1963). ω² is a ratio of the estimated
treatment variance to the estimated total variance. It can
be calculated using the following formula:

    $\omega^2 = \sigma_A^2 / \sigma_T^2$

    where $\sigma_A^2 = (MS_A - MS_e)/n$

    and $\sigma_T^2 = (MS_A + (n-1)MS_e)/n$.


The ratio indicates the approximate amount of variance which
each factor is responsible for. This statistic is presented
in Table 4.5 for the subject, consonant, vowel, and
intensity factors. The subject factor accounts for over half of
the total variance, 74 percent. This is a very large
portion of the total variance associated with a single
factor. Along with the ANOVA results, the large subject
variance proportion shows that subjects are a very important
factor in the experimental results. The next largest
portion of the variance in Experiment I is associated with
the vowel factor, 18 percent. The remaining factors,
consonants and intensity levels, account for 9 percent and 5
percent of the variance respectively.

Table 4.4 Analysis of variance for Experiment I.
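As a worked illustration of the ω² ratio described in this section, the sketch below evaluates the two variance estimates from a pair of mean squares. The subscripts in the source formula are not fully legible, so the reading used here (a factor mean square, an error mean square, and n observations per level) is an assumption, and the numbers are invented for illustration rather than taken from Table 4.4.

```python
def omega_squared(ms_factor, ms_error, n):
    """Estimate the proportion of total variance tied to one factor.

    Assumed reading of the formula:
      treatment variance = (MS_factor - MS_error) / n
      total variance     = (MS_factor + (n - 1) * MS_error) / n
    """
    var_treatment = (ms_factor - ms_error) / n
    var_total = (ms_factor + (n - 1) * ms_error) / n
    return var_treatment / var_total


# Hypothetical mean squares (not thesis values).
print(round(omega_squared(ms_factor=100.0, ms_error=10.0, n=10), 3))  # 0.474
```

Under this reading the total variance is the treatment variance plus the error mean square, so the ratio approaches 1 as the factor mean square grows relative to the error term and is 0 when the two are equal.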


Consonant Confusions 

As the present experiments were designed to induce 
stop-fricative changes in perception, consonant confusions 
in manner cannot be considered errors in identification. 
Confusions in place of articulation, however, are true
perception errors. These confusions involve labial
consonants being perceived as dentals and vice versa. In
Experiment I, the method of subject response did not allow
for place confusions. But two subjects reported hearing the
fricative /ð/ when given the choice of /b/ or /v/ as
response, i.e. when hearing a labial consonant. These two
subjects indicated when this perception occurred, so a small
amount of consonant confusion data is available from
Experiment I. The response /ð/ to a labial occurred only
when the word "bat"/"vat" or "bee"/"vee" was presented,
never when "boat"/"vote" was the stimulus, presumably
because the word "thote" does not occur in English. Five
percent of the responses to "bee" of these two subjects were
"thee". Thirty-two percent of the responses to "bat" were
"that". These limited Experiment I data seem to indicate
that 19 percent of the "bat" and "bee" stimuli were perceived
as beginning with dental consonants. If the responses of the
other three subjects in this experiment are included, the
ratio of labial/dental confusion in Experiment I drops to
five percent.


B. Experiment II 


Measurements 


Duration 

The durations of the Experiment II stimuli are given in 
Table 4.6. The durations of the stimuli with increased
consonant amplitude are not included since the amplitude
increase did not affect the duration. The first column in
Table 4.6 shows the duration of the entire original
syllable, the second column shows the vowel duration, and
the third the duration of the original fricative. The last
three columns give the exact durations of the stimuli
consisting of three, two, and one period of frication. The
average durations of the labial and dental fricatives are
identical, 114 ms. But the duration of the /v/ stimuli
spans a much greater range. The standard deviation of the
durations of the three /v/'s is 28 ms. The standard deviation
of the three /ð/ durations is 4 ms.

Table 4.6 Durations of Experiment II stimuli.

                              Duration in ms

Syllable   Syllable      Vowel   Original C   3-Period C    2-Period C    1-Period C
vee        360           274     86           29            [illegible]   9
vat        526           412     114          [illegible]   [illegible]   9
vote       538           396     142          [illegible]   18            9
thee       [illegible]   239     118          [illegible]   [illegible]   9
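The mean and standard deviation quoted for the /v/ fricatives can be checked against the legible original consonant durations in Table 4.6. The thesis does not say whether the sample or the population formula was used; the sketch below computes both, and the variable name is only illustrative.

```python
from statistics import mean, pstdev, stdev

# Original /v/ consonant durations in ms (vee, vat, vote) from Table 4.6.
v_durations_ms = [86, 114, 142]

print(mean(v_durations_ms))              # 114
print(stdev(v_durations_ms))             # 28.0 (sample formula, n - 1)
print(round(pstdev(v_durations_ms), 1))  # 22.9 (population formula, n)
```

The sample formula reproduces the reported value of 28, which suggests, without confirming, that the n - 1 form was used.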


Intensity 

Table 4.7 shows the intensities of the Experiment II 
fricative-based stimuli in decibels, averaged over the three 
vowels. Each amplitude increase added 6 dB to the consonant 
intensity. The mean intensities for the original syllables 
are similar, 41 dB for syllables starting with /v/, 42 dB
for syllables starting with /ð/. The dental fricative
alone, at 23 dB, is 4 dB higher in intensity than the
average /v/ at 19 dB. The original fricative consonants
appear to be higher in intensity than the original stops
(see Table 4.2), but the fricatives are approximately ten
times as long as the stops. Ten milliseconds of frication
(13 dB for /v/, 16 dB for /ð/) is lower in intensity than
the original stops (16 dB for /b/, 20 dB for /d/). For their
durations, stops are more intense than fricatives.
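The 6 dB step per amplitude increase follows from the standard relation between an amplitude ratio and decibels, 20 log10(ratio). The short sketch below is a generic illustration of that relation; the function name is hypothetical and nothing here reproduces the PDP-12 measurement procedure.

```python
import math


def amplitude_ratio_to_db(ratio):
    """Convert an amplitude ratio to a level difference in decibels."""
    return 20.0 * math.log10(ratio)


print(round(amplitude_ratio_to_db(2), 2))  # 6.02 (double amplitude)
print(round(amplitude_ratio_to_db(4), 2))  # 12.04 (quadruple amplitude)
```

Doubling the consonant amplitude therefore adds about 6 dB, and quadrupling it adds about 12 dB relative to the normal level.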


Recognition Curves 

The means of all subjects' responses in Experiment II
are shown in the recognition curves in Figures 4.14 to 4.19.
The ordinate shows the proportion of fricative responses.
Although Experiment II subjects were able to respond by
choosing any of the four consonants (/v/, /b/, /ð/, or /d/),
the responses are given in terms of the proportion of
fricative responses. The /v/ and /ð/ identifications were

Table 4.7 Intensities of Experiment II stimuli (dB).

                          Stimulus
Consonant   Original   10 ms   20 ms         30 ms
/v/         19         13      [illegible]   16
/ð/         23         16      18            [illegible]

Figure 4.14 Experiment II /vi/ recognition curves, mean of
all subjects.

Figure 4.15 Experiment II /vae/ recognition curves, mean of
all subjects. [Ordinate: proportion of fricative responses.
Abscissa: duration (ms).]

Figure 4.16 Experiment II /vo/ recognition curves, mean of
all subjects.
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Figure 4.17 Experiment II /dhi/ recognition curves, mean of 
all subjects. 
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Figure 4.18 Experiment II /dhae/ recognition curves, mean of 


all subjects. 
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pooled, as were the /b/ and /d/ identifications, to allow 
fricative versus stop response comparisons. Adding together 
the /v/ and /ð/ responses hides errors of incorrect place
identification. Consonant confusions will be considered 
later in this paper with respect to place of articulation. 
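The pooling described above, and the place errors it conceals, can be sketched as follows. The response labels ("dh" standing for the dental fricative), the function names, and the sample responses are all hypothetical and are not taken from the thesis data.

```python
from collections import Counter

# ASCII response labels; "dh" stands for the dental fricative.
FRICATIVES = {"v", "dh"}
LABIALS = {"v", "b"}


def fricative_proportion(responses):
    """Proportion of fricative responses, pooled over place."""
    counts = Counter(responses)
    return sum(counts[c] for c in FRICATIVES) / len(responses)


def place_error_proportion(responses, stimulus_is_labial):
    """Proportion of responses whose place differs from the stimulus."""
    wrong = [r for r in responses if (r in LABIALS) != stimulus_is_labial]
    return len(wrong) / len(responses)


# Hypothetical responses to one labial stimulus (illustrative only).
responses = ["v", "v", "dh", "b", "v", "d", "v", "v"]
print(fricative_proportion(responses))          # 0.75
print(place_error_proportion(responses, True))  # 0.25
```

In this invented example the pooled fricative proportion is 0.75 even though a quarter of the responses name the wrong place of articulation, which is why place confusions have to be tabulated separately.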

In these figures, 4.14 to 4.19, the original, unaltered
consonant is the rightmost point on the abscissa, labeled 
with the duration of the original consonant. Since 
Experiment II involved three intensity levels, normal, 
double, and quadruple consonant amplitude, these figures 
each have three lines. 

The graph lines show a consistent decrease in the 
proportion of fricative responses as duration decreases. 
The change in response from 50 percent fricative to 50 
percent stop tends to occur, for all consonant intensities, 
when the consonant is between 15 and 30 ms long. Almost all 
the individual subjects' curves, given in Appendix D, 
reflect this increase in stop responses with decrease in 
duration. The only exception is Subject 5, who shows no
stop responses to the /ðæ/ stimuli at normal or double
amplitude levels. In a few other cases, subjects do not
reach 50 percent stop response: Subject 4's /ðæ/ and
Subject 5's /ðo/.


Confidence Ratings 
The Experiment II stimuli received relatively high 


confidence ratings. The overall mean rating was 2.7 out of
3. The generally high scores for these stimuli reflect the
naturalness of the stimuli since subjects gave lower scores
to stimuli which they felt were not speech-like, as well as
to those which were ambiguous. Subjects were confident of
their manner identifications of the original fricatives,
giving them a mean confidence rating of 2.9. The ratings
are lower for the 30 ms stimuli, 2.6, lowest for the 20 ms
stimuli, 2.5, and rise for the 10 ms stimuli (see
Table 4.8). The lower 20 and 30 ms ratings indicate that 
the boundary between stop and fricative identifications 


occurs at approximately this duration. 


Analysis of Variance 


Design 

In Experiment II the inputs to the ANOVA were the
stop/fricative crossover points. Although subjects
responded by choosing a particular consonant in
Experiment II, the ANOVA input points are indications of
where the responses changed from fricative to stop, not from
/v/ to /b/ or /ð/ to /d/. The six trials each subject
performed were totaled into two groups to create two
replications each consisting of nine responses per stimulus.

A few difficulties with the 50 percent crossover points 
were encountered. When the recognition curves did not reach 
50 percent stop response, if the last point on the 
recognition curve was close to 50 percent, the line from the 


second last point through the last one was extended to judge
the 50 percent crossover. This extrapolation was done three
times for the Experiment II ANOVA input (out of 90 cases).
In three other instances, when the curve did not near 50
percent, the mean of all the other subjects' response
boundaries under that condition was inserted. This was done
so that data would not be missing (especially since these
data were not strictly speaking "missing"). Inserting the
group mean of the condition does not increase the sum of
squares of that factor.

Table 4.8 Mean confidence ratings assigned to
Experiment II stimuli.

Stimulus              Confidence Rating
                      (out of 3)
Original Consonant    2.9

One other complication with the 50 percent crossover 
points occurred in Experiment II. This problem developed 
when the crossover fell between 30 ms duration of the conso- 
nant and the duration of the original consonant. This 
affected about 30 percent of the Experiment II data. If the 
50 percent crossover point were interpolated between the 
30 ms and the original duration, the crossover might fall at 
an unreasonably long duration because the original
consonants' durations ranged from 86 up to 142 ms. Thus it was
possible to "find" a 50 percent crossover point at 86 ms. 
The Experiment I results indicated that this duration was 
unreasonable since by 60 ms almost 100 percent of the 
Experiment I responses were fricatives. More precisely, the 
crossover from stop to fricative response should occur
before 60 ms consonant duration. 

In order to resolve the dilemma of where to place the 
intermediate crossover point, two ANOVAs were calculated.
The data for the first ANOVA assumed the crossover point
occurred between 30 ms and the original fricative duration.
The second data set used a more conservative estimate of the 
crossovers between 30 ms and original duration which 
involved assuming that responses would be 100 percent
fricative when the consonant duration was 60 ms. So, the
crossover point was interpolated between 30 and 60 ms. This
resulted in crossover points at shorter durations, durations
closer to the Experiment I boundaries and to the rest of the 
Experiment II data. 
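The difference between the two crossover estimates can be made concrete with a small sketch. It assumes linear interpolation on the proportion of fricative responses; the proportion at 30 ms is invented for illustration, the upper endpoint is taken as 100 percent fricative in both cases, and the function name is hypothetical.

```python
def interpolate_crossover(d_short, p_short, d_long, p_long, level=0.5):
    """Linearly interpolate the duration at which the fricative
    proportion equals `level` between two measured points."""
    return d_short + (level - p_short) * (d_long - d_short) / (p_long - p_short)


# Hypothetical case: 25 percent fricative responses at 30 ms.
p_at_30 = 0.25
original_ms = 142  # longest original fricative in Table 4.6 ("vote")

# First data set: upper endpoint is the original consonant duration.
first = interpolate_crossover(30, p_at_30, original_ms, 1.0)
# Second data set: responses assumed 100 percent fricative at 60 ms.
second = interpolate_crossover(30, p_at_30, 60, 1.0)

print(round(first, 1))  # 67.3
print(second)           # 40.0
```

With the same response proportion at 30 ms, anchoring the upper end at 60 ms rather than at the original duration moves the estimated boundary from roughly 67 ms down to 40 ms, which is the direction of the shift described for the second data set.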

A comparison of the results of these two ANOVAs
indicated that they were largely the same, although the sums
of squares of the first were usually much larger than those
of the second data set. Only one difference in significance
of factors appeared: the vowel by intensity interaction of
the first ANOVA had a probability of 0.0407 (p < .05); in
the second its probability was 0.0063 (p < .01). Otherwise
all factors which had probabilities less than .01 in the
first ANOVA also had low probabilities in the second. Since
the second represented a more conservative and more
reasonable estimate of the 50 percent crossover points, it
was selected as the ANOVA to represent the Experiment II
data. Thus in Experiment II, if the crossover fell above
30 ms it was calculated as lying between 30 and 60 ms.

The ANOVA was compiled using BMDP 8V in a four factor
repeated measures design. The four factors are subjects
(5), intensity (3 levels: normal, consonant amplitude times
two, and consonant amplitude times four), consonants (2:
labial and dental), and vowels (3: /i/, /e/, /o/). Subjects
and replications were treated as random effects; intensity,
consonants, and vowels were regarded as fixed effects. The
repeated measures were the two replication groups of subject
trials.

The ANOVA of the Experiment II 50 percent crossover 
points showed most factors and interactions to have very low 
probabilities. Table 4.9 gives these ANOVA results. The 
only factor significant at the .05 level and not at .01 is
vowels, with a probability of .04. All other factors had
probabilities of .01 or less, except for the following:
consonant, the consonant by intensity interaction, and the
consonant by vowel by intensity interaction. The highest level
interaction, subject by consonant by vowel by intensity
(p = .01), is graphed in Figures 4.20 and 4.21. The first
three graphs show the results involving the consonant /v/ at
the three intensity levels; the second three graphs show the
consonant /ð/ at the three intensity levels.


Proportion of Variance of Each Factor

The ω² statistic, an estimate of the proportion of
variance of each factor relative to the total variance, was 
calculated for the Experiment II factors (see Experiment I 
results). Table 4.10 gives the results. The subjects 
accounted for 52 percent of the variance, over half the 
total variance. Intensity also had a major impact in 
Experiment II; it accounts for 30 percent of the variance. 


Vowels and consonants are the source of much less variation
in the responses, 6 percent and 5 percent respectively.

Table 4.9 Analysis of variance for Experiment II.
[Surviving entries. Source: MEAN, SCVI, R(SCVI). Error term:
S, R(SCVI), SC, SV, SI, R(SCVI), R(SCVI), SCV, R(SCVI), SCI,
SVI, R(SCVI), R(SCVI), R(SCVI), SCVI, R(SCVI).]

Figure 4.20 Experiment II, subject by consonant by vowel by
intensity interaction - labial consonant.

Figure 4.21 Experiment II, subject by consonant by vowel by
intensity interaction - dental consonant. [Abscissa: vowel.]

Table 4.10 Proportion of variance of each
Experiment II factor.

Factor       ω²            Rank
Subjects     [illegible]   (1)
Consonant    .047          (4)
Vowel        .063          (3)
Intensity    .300          (2)


Consonant Confusions 

Experiment II provided more opportunities for consonant 
confusions. Subjects were allowed to respond with any of 
the four consonants, so it was possible to respond 
incorrectly with respect to place of articulation of the
consonant stimulus. Six percent of the responses to labial
stimuli were dental, 3 percent /d/ and 3 percent /ð/
responses. Only one percent of the dental stimuli were
heard as labials. This 1 percent consisted of incorrect /b/
responses; no /v/ responses to a dental consonant occurred.
The 5 percent Experiment I and 6 percent Experiment II
incorrect perceptions of labials and the 1 percent
Experiment II incorrect perceptions of dentals accord with
the consonant confusion studies mentioned in Chapter II.
The place of articulation error rates in the present
experiments are actually relatively low. The average rates,
however, gloss over the subject variation: one subject in
Experiment II identified 17 percent of the labial stimuli as
dentals. This subject's high error rate means that the
place perception of the other subjects was even more 
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V. Discussion 


A. Measurements 


Duration 

The measurements of the original consonants in these 
experiments indicate they are within a normal range of
fricative and stop durations. The durations of the /v/'s, from
86 to 142 ms, shown in Table 4.6, fall between Umeda's 78 ms
(1977) and Abbs and Minifie's 153 ms mean (1969). Likewise,
the /ð/ durations, from 110 to 118 ms long, are greater than
Umeda's 52 ms and slightly less than Abbs and Minifie's
123 ms mean. The stop consonant burst durations fall within
one standard deviation of Klatt's (1975) measured durations
(see Table 4.1). The /b/ mean, 7 ms, and /d/ mean, 12 ms,
are typical for these consonants in English.


Intensity 

The intensity measurements collected by Abbs and
Minifie (1969), Malecot (1968), and Ohde and Stevens (1983)
indicate that stops have higher intensities than fricatives
and that dentals have greater intensity than labials. In this
study the dentals do have greater intensities than the
labials, but the intensity of the stops does not appear to
be greater than that of the fricatives. The intensity of
/d/ is 20 dB and /ð/ is 23 dB. The /b/ intensity is 16 dB,
while /v/ is 19 dB. Considering the difference in duration
between the stops and the fricatives, though, the intensity
of the stops is greater, for its duration, than that of the
fricatives. The fricatives are approximately ten times as
long as the stops. The intensity of the stimuli (derived
from fricatives) that are the same duration as the stops is
less than that of the original stop consonants (see Tables
4.2 and 4.7). The measured intensities of the consonants
used in the present study are as expected: the stops are
more intense than the fricatives and the dentals are more
intense than the labials.


B. Onset 

The lack of onset slope effect in the attempt to induce 
fricative responses from stop consonants in Experiment I 
shows that the difference between a stop and a fricative is 
not simply plus or minus burst. Removing the burst from a 
stop did not cause it to be heard as a fricative. While 
duration of onset is considered to be a primary factor in
the fricative/affricate distinction (Gerstman, 1957), it
cannot be in the stop-fricative distinction because of the
duration factor. The duration of the onset slope stimuli
which were originally stops is about 10 ms. This is too
short for them to be identified as fricatives since
fricatives typically have durations of approximately 100 ms.
This duration is too short even for affricate identification
since affricates typically have rise times of 50 ms. So,
although the consonants did not have abrupt onsets, they
were identified as stops due to their short durations. This
result supports Van Heuven's (1979) claim that the primary
cue in the fricative/affricate distinction is not rise time,
as Gerstman states, but overall duration. Carden et
al. (1980) have also claimed that the primary cue in the
stop/fricative distinction is duration. The onset
recognition curves for Experiment I show clearly that onset
slope alone does not distinguish between stops and
fricatives.


GueDurat Lon 

In both the stop to fricative and fricative to stop
identification experiments, the crossover in manner
identification tends to occur at approximately 25 ms consonant
duration. The location of the boundary is apparent from
both the subjects' recognition curves (see Figures 4.7 to
4.12 and 4.14 to 4.19) and the confidence ratings associated
with the stimuli (see Tables 4.3 and 4.8). Both the
recognition curves and the confidence ratings show that the
stop/fricative boundaries of the subjects are at approximately
20 to 30 ms consonant duration. Subjects' perceptions vary,
depending on the consonant, vowel, and intensity involved,
but identifications usually cross from 50 percent stop to 50
percent fricative within 15 ms of 25 ms. This duration,
25 ms plus or minus 15, seems to be a "magic" number in
speech recognition. Many durational boundaries between
speech sounds occur at approximately 25 ms. For example,
Lieberman (1977) shows that stop VOTs of less than 25 ms
are perceived as /b/ while those of more than 25 ms are
heard as /p/.
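
The 50 percent crossover referred to above can be read off a
recognition curve by linear interpolation between the two
tested durations that straddle the 50 percent point. The
sketch below is only an illustration: the function name
`crossover_ms` and the response proportions are invented, and
the interpolation method is an assumption, not the procedure
reported for these experiments.

```python
def crossover_ms(durations_ms, prop_stop, criterion=0.5):
    """Return the duration at which the proportion of stop responses
    first falls through `criterion`, using linear interpolation
    between neighbouring points of the recognition curve."""
    points = list(zip(durations_ms, prop_stop))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 >= criterion > p1:
            # Fraction of the way from d0 to d1 at which the curve
            # reaches the criterion.
            return d0 + (p0 - criterion) * (d1 - d0) / (p0 - p1)
    return None  # the curve never crosses the criterion


# Hypothetical proportions of "stop" responses at each consonant duration.
durations = [10, 20, 30, 40, 50]
stop_props = [0.95, 0.70, 0.30, 0.10, 0.05]
print(crossover_ms(durations, stop_props))
```

With these made-up proportions the interpolated crossover
falls at about 25 ms. A curve that never passes through 50
percent returns `None`.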

Other studies mention durations close to 25 ms as being
critical. Liberman (1956) reports that the stop /b/ is
identified as the semivowel /w/ when the duration of its
transitions reaches 40 ms. Another instance of a 40 ms
boundary was noted by Cutting and Rosner (1974). They
found, in both speech and non-speech stimuli, a "fixed"
auditory sensitivity at 40 ms. Howell and Rosen (1983)
dispute Cutting and Rosner's fixed auditory sensitivity,
claiming that it does not correspond to the
fricative/affricate rise time boundary, which occurs at a
duration greater than 40 ms. Howell and Rosen support their
argument by showing that natural affricates usually have
rise times greater than 40 ms. Howell and Rosen, and
Cutting and Rosner, assume that "plucked" and "bowed"
(Cutting and Rosner's non-speech identification terms)
correspond to affricates and fricatives. Perhaps these
terms apply to stops and fricatives and perhaps the 40 ms
boundary does too.

An argument against applying Cutting and Rosner's
"plucked"/"bowed" boundary to stops and fricatives is that
theirs is a duration of onset boundary while the
stop-fricative boundary is one of duration, but duration and
onset work together. In natural speech fricatives, and in
stimuli created for the present study, duration covaries with
rise time. Natural fricatives have roughly linear rises
reaching full amplitude shortly before vowel onset. Also,
Cutting and Rosner, Van Heuven (1979), and Gerstman (1957) all
acknowledge that duration and rise time are much more potent
together than either separately. In the present experiment,
it was impossible to separate the two, using real speech,
and still achieve natural sounding consonant stimuli. If
stop consonants are concatenated to increase duration
without being ramped to create gradual onsets, the results
sound like mechanical buzzes, not like speech. Hence an
increase in duration meant an increase in rise time. This
also occurs in the production of fricatives. A rapid rise
time followed by a long steady time would be an affricate.
Since Cutting and Rosner's 40 ms boundary does not apply
well to affricates and stops, it may be more aptly applied
to stops and fricatives, even though the 40 ms boundary is
15 ms longer than the stop-fricative boundary found here.

The variability of the subjects' crossover points
around the 25 ms mark may be related to the sensitivity of
the auditory system. Lieberman (1977) notes that a time
delay of 20 ms is needed between two sounds for listeners to
judge which came first. Lieberman suggests there is a 20 ms
auditory resolution factor. Klatt and Cooper (1975) report
that the just noticeable difference (JND) for duration
change of the segments /i/ and /ʃ/ is 25 ms or more.
Fujisaki, Nakamura, and Imoto (1975) found the accuracy of
discrimination for 100 ms of white noise is 9.1 ms. The
white noise is comparable in duration and content to a
voiceless fricative. People can detect differences in
duration of 10 to 20 percent of the original speech segment
duration. However, a just noticeable difference duration
change in a consonant does not cause it to sound abnormal.
Huggins (1972) found that a word-initial /ð/, originally
115 ms long, was judged "normal" when its duration was
between 83 and 162 ms. Subjects, then, are not aware of
slight differences in duration and are tolerant of fairly
large changes in duration. The crossover variation within
subjects, associated with the consonant, vowel, and
intensity factors in the experimental results, may be due to
the inability of the auditory mechanism to discriminate
differences in noise duration and tolerance of duration
differences. These possible sources of crossover variation
may result in chance differences in boundary placement which
would cause apparent shifts of a subject's labial/dental
stop/fricative boundary.
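
As a rough illustration of the 10 to 20 percent figure, the
sketch below computes the range of just noticeable duration
changes for a couple of segment durations. The function name
`jnd_range_ms` and the example durations are chosen here for
illustration; only the 10 to 20 percent proportions come from
the text.

```python
def jnd_range_ms(duration_ms, low=0.10, high=0.20):
    """Return the (smallest, largest) just noticeable duration change,
    taking the JND as 10 to 20 percent of the segment duration."""
    return duration_ms * low, duration_ms * high


# Example durations in ms: one near the crossover region and one
# comparable to the 100 ms noise segment mentioned in the text.
for duration in (25, 100):
    low_ms, high_ms = jnd_range_ms(duration)
    print(f"{duration} ms segment: JND between {low_ms:.1f} and {high_ms:.1f} ms")
```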

If the boundaries of individual subjects are varying
randomly, for the reasons mentioned above, then
statistically significant interactions between subjects will
appear, just by chance. While the author does not believe
that all the response variation can be attributed to chance
differences in duration perception, the very low analysis of
variance probabilities of the experimental factors and
interactions are believed to be partly the result of a
statistical system which discriminates more finely than the
human auditory perceptual system does.

Production of stops and fricatives may explain boundary 
variation between subjects. The variation in listeners' 
boundaries is not important in distinguishing English voiced 
labial and dental stops and fricatives, because of the 
produced durations of these speech sounds. The mean dura- 
tion of the original, unaltered stop consonants recorded for 
these experiments is 9.5 ms; the mean fricative duration is 
114 ms. The difference between these stop and fricative 
durations is 104.5 ms. While it may be true that the dura- 
tion differences in production of these consonants are 
regulated by listeners' perceptual abilities, the important 
point here is that differences among crossover boundaries 
found in these experiments are irrelevant because of the 
durations produced in real speech. To verify the truth of 
this statement, consider the range of various boundaries 
found for various subjects in these experiments and consider 
the mean durations of actual labial and dental stops and 
fricatives. Stop/fricative boundaries in Experiment I
ranged in duration from approximately 10 ms (for Subject 2's
/bæ/ at normal amplitude) up to 50 ms (for Subject 3's /di/
at increased amplitude). (The 10 ms figure is an
approximation because Subject 2 heard all the duration
varying stimuli based on the syllable /bæ/ as fricatives
more than 50 percent of the time.) If each of the
originally recorded, real speech stops and fricatives, 12 in
all, were categorized based on the 150 boundaries found in
the experiments (60 in Experiment I, from five subjects, two
consonants, three vowels, and two amplitude levels, plus 90
in Experiment II, from five subjects, two consonants, three
vowels, and three amplitude levels), 98.8 percent of the
identifications would be correct with respect to manner of
articulation. All the fricatives would be correctly
identified; 878 of the 900 stop identifications (97.6 percent)
would be correct. Despite the variation due to subjects,
consonants, vowels, and consonant amplitudes, the 50 percent 
stop/fricative crossover points used as manner recognition 
criteria result in correct categorizations of the naturally 
occurring stops and fricatives 99 percent of the time, based 
on duration alone. If abruptness of onset were also used as 
an identification criterion, all the stops would probably be 
correctly identified because of their initial bursts. 
Intensity could also be used as a secondary characteristic
on which judgements may be based.
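
The percentages quoted above follow from simple counts. The
short check below reproduces them. The even split of the 12
recorded tokens into 6 stops and 6 fricatives is an assumption
(two consonants by three vowels for each manner), inferred
from the 900 stop identifications mentioned in the text.

```python
# Number of boundaries: subjects x consonants x vowels x amplitude levels.
boundaries_exp1 = 5 * 2 * 3 * 2   # 60 in Experiment I
boundaries_exp2 = 5 * 2 * 3 * 3   # 90 in Experiment II
boundaries = boundaries_exp1 + boundaries_exp2   # 150 in all

# Assumption: the 12 recorded tokens are 6 stops and 6 fricatives.
stops, fricatives = 6, 6
stop_ids = boundaries * stops            # 900 stop identifications
fricative_ids = boundaries * fricatives  # 900 fricative identifications

correct_stop_ids = 878                   # figure reported in the text
correct_fricative_ids = fricative_ids    # all fricatives correctly identified

stop_pct = 100 * correct_stop_ids / stop_ids
overall_pct = 100 * (correct_stop_ids + correct_fricative_ids) / (stop_ids + fricative_ids)

print(round(stop_pct, 1), round(overall_pct, 1))  # prints: 97.6 98.8
```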

Clearly the variation in results between subjects and
within subjects, associated with consonant, vowel, and
amplitude levels, is inconsequential in identification of
actual stops and fricatives. Actually occurring English
voiced labial and dental stops are shorter than almost all
the 50 percent crossover boundary points. Actual English
voiced labial and dental fricatives are longer than all the
crossover boundary points. The produced durations of these
stops and fricatives allow listeners a large amount of
discretion in establishing a stop/fricative duration
boundary. Since listeners have no need to choose a certain,
precise boundary location, it would be surprising if the
distinguishing durational criteria of all subjects were
exactly the same.

High similarity among subjects' duration boundaries
might suggest a fixed, innate auditory duration sensitivity.
The variation found among subjects indicates that the
boundary is not fixed across all listeners. Language
variations in stop and fricative durations argue against an
innate, fixed boundary in humans. For instance, Danish
voiced labial and dental stops are longer than the
comparable English stops, approximately 20 ms long
(Fischer-Jorgenson, 1954). Categorizations of Danish stops
using the duration criteria found here would yield much
higher error rates than the criteria would for English
stops. Although Danish listeners could use other acoustic
characteristics of the signal as verifying or supplementary
criteria in stop vs. fricative identifications, raising the
lower limit of the duration boundary would enable them to
judge manner of articulation solely by duration, as English
speakers can. This is not proof that Danish duration
boundaries are higher, but they may well be, in which case
the 25 ms duration boundary is not fixed and innate in human
perceptual systems.

Now that I have argued that the statistical differences
associated with the experimental treatment factors are
irrelevant, I will attempt to explain the boundary variation
related to the factors. Consistent variation, particularly,
deserves consideration if we are to identify acoustic
aspects, other than duration, which influence stop and
fricative perception.


D. Factors and Interactions 

The subject by consonant by vowel analysis of variance 
interaction of Experiment I (p < .001) means the stop/fric- 
ative boundaries in Experiment I cannot be discussed without 
considering subjects, consonants, and vowels simultaneously. 
Similarly, the subject by consonant by vowel by intensity 
interaction of the Experiment II ANOVA (p < .01) means 
Experiment II boundaries depend on all four factors. These 
interactions, because they involve so many factors, are 
complex. They can be described, as they appear in Figures
4.13 (Experiment I) and 4.20 to 4.21 (Experiment II), but
their complexity makes them difficult to explain. To
simplify the discussion, the factors are explained, as far
as possible, one at a time in the following sections.
Although the effects are considered individually, their
interaction with other factors must be remembered and will
be reiterated. And, although the causes for the variations
in both experiments are discussed at the same time, the
analyses of variance are separate.


Subjects

Subject variation accounts for more than half of the
variance in these experiments, 74 percent in Experiment I
and 52 percent in Experiment II (see Tables 4.5 and 4.10).
As discussed earlier, the variance due to subjects, although
large, is irrelevant in processing real speech, as all the
factor variance is, because of the great difference between
the duration of stop consonants and the durations of
fricative consonants.

In the discussion of duration, the sensitivity of the 
auditory mechanism was proposed as part of the reason for 
crossover variation within and between subjects. Individual 
crossovers varied over a range of 40 ms, from 10 to 50 ms 
consonant duration. The auditory system's JND for speech 
sounds is 10 to 20 percent of the original duration, which 
may account for some of this variation. Listeners are not
accurate in duration discriminations; thus variation in
duration crossover points occurs.

Subject variation may also be related to the production 
of speech sounds. The wide difference in the duration of 
stop and fricative consonants allows subjects discretion in 
establishing a manner boundary. As well, segment duration
varies according to linguistic context: whether the word is
pronounced in isolation or running speech, where the word
occurs within a sentence, whether or not the word is
stressed, and so on. Listeners must cope with this
variation by selecting an optimum duration boundary to
distinguish speech sounds, one immune to variation due to
linguistic context. We should be surprised, because of the 
variety of sources of duration variation, and because of the 
differences among people in such things as exposure to 
speech patterns, if duration boundaries for all subjects 
occurred at exactly the same place. 

Between subject variation may simply reflect subjects
adopting different strategies for performing the
experimental task. It is not possible to know exactly how
subjects performed the experimental task. It is plausible
that more than one method was used, even by a single
subject. While subjects did not report on how they
performed the task, they expressed little difficulty in
making the identifications. Subjects did report variation
within themselves in assigning the confidence ratings.
Sometimes a subject based the rating on confidence alone;
other times the ratings were based on stimulus naturalness.
Hopefully, since subjects did not find the identification
task unusual or difficult, the criteria they used in
categorizing the consonants were ones used in everyday
speech processing.

Since subjects are one factor in the highest order
interactions of both experiments, subject variation is
related to the consonant, vowel, and (in Experiment II)
intensity of the consonant involved. An inspection of the
individual subject recognition curves reveals this
variation.

The between subject variation is evidence that people
are different. People have been exposed to different
auditory stimuli and people cope with these stimuli
differently. The variation among people points to the
effectiveness of the communication system. Because people
are different and because individuals are not 100 percent
accurate in discriminating acoustic signals, the
communication system depends on gross differences in the
acoustic signals of speech sounds. The gross differences
are necessary so people can easily identify speech sounds
and then virtually ignore the auditory level of speech
perception and concentrate on higher levels of perception.
The difference in duration between stops and fricatives is
one of these gross differences that facilitate speech sound
identification.


Intensity 

Intensity accounted for 4.5 percent of the variance in
Experiment I and was not involved in the highest level
interaction in that experiment. The recognition curves for
subjects in Experiment I show a slight tendency for a greater
proportion of stop identifications at higher consonant
amplitudes when the stimulus duration is short, but this
difference is not statistically significant. At short
durations, an increase in the amplitude of the stimulus causes
its intensity to increase towards that of a stop with a
burst. For example, adding 6 dB (the amount added by the
amplitude increase) to the 20 ms /b/ stimulus increased its
intensity to 17 dB. The new intensity is very close to that
of the original /b/, 16 dB. This may partly account for the
increased stop responses to the increased amplitude, short
duration stimuli in Experiment I.

In Experiment II, the intensity factor accounts for 30
percent of the overall variance. The intensity effect shows
up well on most of the subjects' recognition curves as a
higher proportion of stop responses at higher consonant
amplitude levels. Intensity is involved in the fourth order
Experiment II interaction (p < .01). This means the
intensity effect depends on the subject, consonant, and vowel
involved. For example, Subject 4 shows variation due to
amplitude for /væ/ and /ði/ but not for /vi/, /vo/, /ðæ/, or
/ðo/. Subject 2 shows a noticeable increase in stop responses
due to intensity for all syllables. The intensity increase 
effect is especially apparent at consonant durations of 20 
and 30 ms, durations near the fricative/stop boundary. At 
these durations, stimuli are somewhat ambiguous in their 
manner status, so their perception seems to be more 
susceptible to the influence of other factors. (Ohde and
Stevens (1983) also found that the intensity effect in iden- 
tifying stops as labial or dental was more pronounced for 
ambiguous stimuli.) 

The reason why intensity was highly variable in
Experiment II but not in Experiment I may be associated with
onset slope. All the Experiment I stimuli had gradual
onsets. The Experiment II stimuli onsets were not ramped to
be gradual, so an amplitude increase could also cause an
abrupt onset. However, the 20 and 30 ms stimuli, which
showed the greatest increase in stop responses along with
intensity increases, all begin with the first 10 ms of the
fricative signal. This first period of a fricative is very
low in amplitude and generally does not show abrupt changes
in amplitude (see Figures 3.14 and 3.15). Thus the
amplitude increase would not create an abrupt onset on these
stimuli.

The most likely reason why an amplitude increase 
resulted in a stop response increase is that stops have 
higher intensities at short durations than fricative signals 
of short duration. The question of why Experiment I did not
also show a significant intensity effect remains unanswered.


Vowels 

The vowel factor accounts for 18 percent of the
Experiment I variance and 6 percent of the Experiment II
variance. Vowels were involved in the highest level
interactions of both experiments. Figure 4.13, the graphs
of the highest level Experiment I interaction, and Figures
4.20 and 4.21, the highest level Experiment II factor
interaction graphs, show the response variation due to the
vowel is irregular. When the consonant is dental, the
stop/fricative crossover is lower with the vowel /æ/ than
with the vowels /i/ and /o/, which have approximately equal
crossovers. When the consonant is labial, the /æ/ crossover
is highest in Experiment II and the /i/ crossover is highest
in Experiment I. Also, there is a large amount of variation
among the subjects with, especially, the labial consonant
and the vowel /æ/. The irregularity in the boundaries
associated with the vowels makes the vowel effect difficult
to explain. Three possible causes will be considered. One
concerns the CV relationship, the second the frequency of
occurrence of the CV pair, and the third the influence of
the particular stimulus word.

One possible explanation of the vowel effect is
consonant-vowel coarticulation. Carney (1971) points out that
coarticulation is greater when the tongue is not involved in
the consonant production. When the tongue is not involved,
it is free to assume the position of articulation of the
following vowel, causing coarticulatory effects. Since the
tongue is not used in /b/ production, but is in /d/
production, the vowel effect would be greater for /b/ than
for /d/. Thus the greater variability found in crossovers
of labial syllables than in dental syllables in both
Experiment I and Experiment II may be due to consonant-vowel
coarticulation. Precisely how perception is influenced by
coarticulation is not clear from this study.

The vowel effect may also be related to the frequency
of occurrence in real speech of the vowels /i/, /æ/, and /o/
with the consonants /b/, /d/, /v/, and /ð/. Denes (1963)
counted the number of times these and other vowels occurred
after these consonants in spoken British English.
(Actually Denes collected his data from phonetic readers for
students of English as a second language.) Although the
dialect of English of the recorded words in his study is
British, not Canadian, Denes' statistics may be helpful in
interpreting the vowel and consonant factors. Table 5.1
gives the percentage occurrence of each of the three vowels
with the four consonants. The percentage is calculated as
the number of times a vowel occurred after a consonant
divided by the total number of times any vowel occurred
after that consonant. Table 5.1 also shows the
rank order, based on frequency of occurrence, of each vowel
after each consonant, in comparison with all 20 vowels Denes
included in his study. The co-occurrences of the consonants
with the vowels are variable. None of the vowels occurs
consistently more often than the other two vowels
with all the consonants; none occurs consistently least
often. Perhaps the subject by consonant by vowel
interactions are a result of the inconsistency of occurrence
of these vowels with these consonants. Possibly some
subjects considered the proportion of times /i/ occurs after
/b/ (9.0 percent), for example, in comparison with /i/'s
occurrence after /v/ (0.8 percent) in identifying a labial
before /i/ as /b/ or /v/. It is also possible that subjects
considered the CV co-occurrence frequencies as they appear
in a dictionary (each word occurring only once, regardless
of how often it is used in speech) in identifying
consonants. Subjects may have considered the CV
co-occurrences sometimes, but not others. Or subjects may
have ignored the co-occurrence relationships completely. We
do not know what subjects did do, but CV co-occurrences
could have influenced their responses.

Table 5.1 Percentage frequency of occurrence, and rank order of
occurrence frequency, of /i/, /æ/, /o/ after /b/, /v/,
/d/, /ð/, out of twenty vowels.

                             Vowel
               /i/           /æ/           /o/
Consonant      % (rank)      % (rank)      % (rank)
/b/            9.0 (4)       6.0 (6)       [illegible] (9)
/v/            0.8 (8)       13 (6)        1.0 ([illegible])
/d/            [illegible]   [illegible]   [illegible] (2)
/ð/            0.6 (9)       8.0 (4)       [illegible]
Mean           [illegible]   [illegible]   [illegible]

Note. Data taken from Denes (1963).
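
The percentage and rank calculation described for Table 5.1
can be sketched as follows. The function name
`cooccurrence_table` and the counts are invented for
illustration; they are not Denes' figures, and a single
"other" entry stands in for the remaining vowels.

```python
def cooccurrence_table(counts):
    """counts maps each vowel to the number of times it followed a
    given consonant. Returns {vowel: (percent, rank)}, where percent
    is the vowel's share of all vowels after that consonant and
    rank 1 is the most frequent vowel."""
    total = sum(counts.values())
    ordered = sorted(counts, key=counts.get, reverse=True)
    return {
        vowel: (100 * counts[vowel] / total, ordered.index(vowel) + 1)
        for vowel in counts
    }


# Hypothetical counts of vowels following one consonant.
counts_after_consonant = {"i": 90, "ae": 60, "o": 20, "other": 830}
for vowel, (percent, rank) in cooccurrence_table(counts_after_consonant).items():
    print(f"{vowel}: {percent:.1f} percent (rank {rank})")
```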

It is also possible that the vowel factor variation
rests on the word involved, rather than the CV combination.
Subjects may have been influenced by the particular words
used as stimuli. For example, "Dan", chosen to pair with
the fricative-initial word "than", might not be considered a
"real" word in English since it is a proper noun. Subjects
may have tended to avoid reporting "Dan" as a response
because of its uncommon word status. This could be the
reason why the crossovers of the /dæ/ and /ðæ/ based stimuli
are lower than those of the stimuli with the vowels /i/ or
/o/. The lower crossover duration indicates fewer stop
responses, which may be attributed to avoidance of the
response "Dan".


Consonant 

The consonant factor alone is not statistically
significant in either experiment, but in both this factor
participates in the highest level ANOVA interaction. The
variance associated with the consonant factor is quite low
in both experiments: in Experiment I the consonant factor
accounts for 9 percent of the total variance; in
Experiment II it accounts for 5 percent of the total
variance. So, although consonants do interact with other
factors, their effect is not as large as those of most of
the other factors. In both experiments the consonant effect
seems to be linked to the vowel effect. The same causes
suggested for the vowel variation may account for the
consonant variation. These causes are the consonant-vowel
coarticulation effect, the CV co-occurrence frequency bias,
and the dislike-of-"Dan" hypothesis. The coarticulation
effect suggests that CV coarticulation is more prevalent
with consonants that are articulated without tongue
involvement. Since /b/ and /v/ do not involve the tongue,
while /d/ and /ð/ do, the coarticulations of the former with
the following vowels will be greater. Thus there is more
response variation evident with the labial consonants than
with the dentals.

The variability of frequency of occurrence of vowels
after the consonants in question may affect the consonant
identification. If it does, it means that subjects hold off
consonant identification until the following vowel is
identified. Although this is possible, it seems less likely
than the reverse dependency. Nevertheless it may account
for some of the variation in the consonant recognitions.

The dislike-of-"Dan" hypothesis postulates that
subjects may avoid some responses because the response is
not a "real" word. Two subjects in Experiment II obviously
avoided the "Dan" response, since their recognition curves
of /d/ with this vowel do not reach 50 percent stop
identification. Their avoidance of "Dan" may not have been due
to their belief that it was not a real word, however. They
may not have responded "Dan" because they did not hear
"Dan." (The recognition curves of these subjects, 4 and 5,
are presented in Appendix D.)

Although explanations have been provided for the 
presence of the consonant factor in the highest level ANOVA 
interactions, it seems that the stop-fricative continuum is 
independent of consonant place of articulation. Both the 
labials and dentals have boundaries at approximately the 
same durations. Both consonants respond similarly to 
changes in onset slope. In combination with the vowels,
subjects, and amplitudes they behave differently, but this
variation is relatively slight.


E. Consonant Confusions 

Identifications of stops as fricatives and fricatives 
as stops have been induced in the present experiments. One 
might wonder if changes in place identification occurred 
along with the changes in manner identification. Carden
found in his 1980 study that place identification depended,
to some extent, on manner identification. In particular,
Carden found the truncated voiceless dental fricative, /θ/,
was often identified as a labial stop. Identification of
dental fricatives as labial stops occurred only rarely in
the present study.

Carden et al. suggested the reason for the
labial/dental stop/fricative dependency they found is the
existence of two perceptual boundaries between labials and
dentals, one for stops and one for fricatives. The optimum
boundary between the labial and dental stops occurs midway
between the two, as the boundary between the labial and
dental fricatives falls midway placewise. However, the two
boundaries do not coincide since the stops and fricatives
have (slightly) different places of articulation. The
fricative /θ/ falls on the labial side of the stop place
boundary, although it is on the dental side of the fricative
boundary. So when /θ/ is identified as a stop, it is
categorized as a labial. This hypothesis is implausible
given the difference in place of articulation of stops and
fricatives. The difference is probably not sufficient to
cause perceptible differences in the CV formant transitions.
And the difference in place of articulation between /θ/ and
/d/, if any, would not be enough to place /θ/ on the labial
side of the /b/-/d/ boundary and /d/ on the dental side.

The results of the present experiments do not support the
two boundary hypothesis. No significant errors in place
identification occurred, beyond what was predicted by other
consonant confusion studies. Subjects did not tend to
identify short, stop-like dental fricatives as labial stops.
This identification occurred in only one percent of the
responses. Changes in manner identification did not affect
the place identification. The consonant confusions found by
Carden and those found in the present experiments are more
likely attributable to acoustic factors in the signals. One
possible acoustic factor is lack of burst. Carden et
al. propose, as an alternative to the two-boundary
hypothesis, that lack of burst may be a labial cue. Thus
truncated fricatives, having short noise signals with no
burst, may be identified as labial stops.

The /v/-/ð/ confusions found in the present experiments
are expected in light of the similarity of these two phones
in intensity and duration and in light of the evidence of
the prevalence of this confusion in other studies, such as
Miller and Nicely (1955), Tolhurst (1954), and Wang and
Bilger (1973). The labial-dental fricative confusion
suggests that consonant-vowel formant transitions are not
always sufficient for place identification. Labials,
particularly, would be subject to misidentification on the
basis of formant transitions because the short oral cavity
anterior to the labial place of articulation might provide
poor formant place information. On the basis of formant
transitions, labials would be misperceived more often than
dentals, as they are in these experiments.

"Real" English words and CV frequency of occurrence 
were probably other factors in the occurrence of confusions,
as discussed in the sections on the vowel and the consonant
effects. Kent (1979) noticed a response bias: some
consonants were substituted more often than others in cases of
perception errors. The four most frequently substituted
consonants in Kent's study were, in descending order, /p/,
/b/, /f/, and /d/. These consonants were chosen over 100
times each, constituting 57 percent of all the substitutions
in Kent's study. This bias may also be in effect in these
experiments, perhaps accounting for the high /d/ response to
/v/ and /ð/ stimuli. It is interesting to note that the four
consonants which appear as a response bias are not the most 
frequently appearing consonants in speech. According to 
Denes' (1963) data, /p/ is the fourteenth most frequently 
occurring consonant, /b/ is twelfth, /f/ is
fifteenth, and /d/ is fourth. Apparently the response bias
effect is independent of consonant occurrence frequency in 
speech. 

Place confusions in consonant identifications were 
fairly low in the present studies, considering the findings 
of others. The number of confusions varied, of course,
according to subject. Subjects who made many errors in 
identification made most of these errors by responding /v/ 
to a /ð/ stimulus. Favouring /v/ over /ð/ may represent a
bias against the consonant /ð/ or against function word
responses. (A few of the identification errors can also be 
counted as response recording errors. Making a mistake in 
responding was especially easy in Experiment II, where
switches were pushed to indicate the choice of consonant.)

Consonant confusions in place of articulation proved to 
be insignificant in the present experiments. The
infrequency of consonant place confusions indicates the 
stop-fricative continuum is independent of the place of 


articulation. Labial and dental consonants are both
affected by changes in duration, intensity, and onset slope.
Any confusions that did occur can be explained
satisfactorily as functions of the acoustic properties of
the signals. Dependency between the place and manner
identifications need not be solved by positing a
listener-internal boundary, as Carden et al. (1980) do:
place and manner identifications are dependent on acoustic
cues. When the acoustic signal of a labial, for example,
resembles a dental, it is more likely to be identified as a 
dental. Likewise a dental which resembles a labial conso- 
nant will be heard as a labial. As well, fricatives which 
resemble stops acoustically, by being short in duration, 
will be recognized as stops. And stops which resemble fric- 
atives, by being long and burstless, will be categorized as 


fricatives.

VI. Conclusions


A. Improvements and Further Study 

Some modifications to this study are recommended in 
order to improve the experimental design and to increase the 
reliability of the results. One of these improvements is 
increasing the number of fricative duration conditions. As 
it was, there were only four durational stimuli made from 
fricatives: the original fricatives (85 to 142 ms long), the
30 ms stimuli, the 20 ms ones, and the 10 ms ones. This
range of durations did not adequately cover the stop/fricative
duration boundary: 30 percent of the Experiment II
boundaries fell between 30 ms and the original fricative
duration. To enlarge the range to cover the entire boundary
area, stimuli should also have been made with durations of
40, 50, and 60 ms. Then Experiment II would be exactly the 
same in terms of the number and durations of these stimuli 
as Experiment I, and the point at which the fricative/stop 
boundary falls could be judged more accurately. 

Intensity in Experiment I could also have been examined 
more effectively. The intensity measurements of the 
original consonants indicated that fricatives were less
intense than stops. Taking this into consideration, the way
to make stops resemble fricatives would be to decrease, not
increase, the intensity of the stops. This change might
have caused intensity to be as statistically significant a


factor in Experiment I as it was in Experiment II.

One more criticism of the design of these experiments
might be made with respect to the naturalness of the
presentation conditions. While investigation of specific
acoustic factors cannot be performed in a completely
natural, uncontrolled environment, the presentation of the
stimuli can be arranged to be as representative of naturally
occurring situations as possible. In this respect,
presentation of an auditory signal is better via a
loudspeaker than through headphones since listening through
headphones is an artificial situation. As well, the
stimulus items should be recorded and presented within
carrier phrases. Recording this way controls the intonation
and stress on the key word and helps maintain a consistent,
conversational rate of production. Presentation of the
items within a sentence gives the words context. Listening
to a sentence is much more meaningful than hearing isolated
words. For the present experiments, all the stimulus words
were recorded in carrier phrases. But they were presented
in sentences in Experiment I only. In Experiment II the
stimuli were presented over a loudspeaker, while in Experiment I they
were heard through headphones. Each experiment was natural
in one of the two aspects of presentation, but neither in
both.

Another consideration for experimental naturalness is 
the nature of the stimuli. In these experiments 
monosyllabic English words were used instead of nonsense 


syllables so that the stimuli would be meaningful. Because
of the unique status of the consonant /ð/, the choice of
words was limited, and it was impossible to choose all the
words from the same lexical class. The voiced dental fricative
occurs word-initially only in function words, not in
major class words. Ideally all the words involved would
have been, for example, nouns. This would reduce biases in 
subject responses and expand the carrier phrase 
possibilities used in recording and presentation. 

This study could also have been expanded to cover more 
of the acoustic aspects of stops and fricatives. For 
instance, onset could have been controlled more completely. 
In order to separate the duration changes from the onset 
slope changes, duration could have been increased in the
stop stimuli without removing the stop's abrupt onset. At
longer durations this signal might sound unnatural, or it
might sound like an affricate, or it might sound like a
fricative. Another way to separate duration and onset slope 
is to cross the two factors so there are abrupt, burstless, 
and gradual onsets at every duration level. Again, some 
unnatural sounding stimuli might result, but allowing 
subjects to indicate which sounded unnatural would mark 
these segments. The crossing of the duration and onset 
slope factors would be ideal for testing the relationship
between the two, seeing how each factor affects the other. 
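
The crossed design suggested above can be sketched as a small Python
listing. This is only an illustration: the three onset labels come from the
text, but the duration levels are assumed values chosen for the example, and
the function name crossed_design is hypothetical.

```python
from itertools import product

# Onset labels follow the text; the duration levels (in ms) are
# illustrative assumptions, not the stimulus set used in the thesis.
ONSETS = ("abrupt", "burstless", "gradual")
DURATIONS_MS = (10, 20, 30, 40, 50, 60)


def crossed_design(onsets=ONSETS, durations_ms=DURATIONS_MS):
    """Return every onset-by-duration combination as a list of dicts."""
    return [
        {"onset": onset, "duration_ms": duration}
        for onset, duration in product(onsets, durations_ms)
    ]


if __name__ == "__main__":
    conditions = crossed_design()
    # 3 onset types x 6 durations = 18 stimulus conditions
    print(len(conditions))
```

Because every onset type appears at every duration level, the effect of
each factor and their interaction can be examined separately.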

Onset abruptness also needs further study. We need to 
know how small an amplitude increase is perceived as an 


abrupt onset. A simple experiment to test this would
involve increasing and decreasing the amplitude of a stop
onset at the beginning of a signal. The duration
of the stimulus could be varied in addition to onset
amplitude, and the items might be categorized as stops or
fricatives. This has been done only partially in the experiments
here, only with gradual, rather than abrupt, onsets.

Finally, we must consider the validity of the 50 
percent crossover point as an indication of the location of 
a boundary. Listeners may not have discrete boundaries they 
use to discriminate speech signals. Rather, they may have 
regions of uncertainty. For instance, consonants with 
durations anywhere between 20 and 30 ms may be difficult for 
subjects to classify since the consonants’ durations are 
such that subjects cannot be certain about the stop/fric- 
ative status of the consonant. The 50 percent crossover 
point, then, indicates the approximate location of the 
region within which subjects are uncertain of their 


responses. 
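
As an illustration of the point above, the 50 percent crossover and a
surrounding region of uncertainty can both be estimated from an
identification function by linear interpolation. The sketch below is not the
procedure used in the experiments; the function names, the 25 and 75 percent
limits of the region, and the response proportions are all assumptions made
for the example.

```python
def crossover_point(durations_ms, proportions, criterion=0.5):
    """Interpolate the duration at which the response proportion
    crosses `criterion`. Returns None if it never crosses."""
    points = list(zip(durations_ms, proportions))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 == criterion:
            return float(d0)
        # A sign change means the criterion lies between the two points.
        if (p0 - criterion) * (p1 - criterion) < 0:
            return d0 + (criterion - p0) * (d1 - d0) / (p1 - p0)
    if points and points[-1][1] == criterion:
        return float(points[-1][0])
    return None


def uncertainty_region(durations_ms, proportions, low=0.25, high=0.75):
    """Durations at which responses cross the (assumed) low and high limits."""
    return (
        crossover_point(durations_ms, proportions, low),
        crossover_point(durations_ms, proportions, high),
    )


if __name__ == "__main__":
    # Hypothetical proportions of fricative responses at four durations.
    durations = [10, 20, 30, 40]
    fricative_responses = [0.0, 0.25, 0.75, 1.0]
    print(crossover_point(durations, fricative_responses))     # 25.0
    print(uncertainty_region(durations, fricative_responses))  # (20.0, 30.0)
```

With these made-up proportions the crossover falls at 25 ms, inside a
region of uncertainty running from 20 to 30 ms.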


B. Summary and Conclusions 

In summary, the primary acoustic feature which 
listeners use to discriminate English voiced labial and 
dental stops and fricatives is duration. In one experiment
in this study the duration of the noise of stops was
increased; in the other experiment, the duration of the
noise portion of fricatives was decreased. Both experiments
indicated that listeners shift from /b/ and /d/ to
/v/ and /ð/, or vice versa, when the consonant duration is
approximately 25 ms. This manner boundary varies
considerably according to the subject, consonant place, and
vowel involved. In the second experiment, the amplitude of 
the consonant also interacted in the boundary variation. 
The variation in both experiments, although statistically 
significant, is inconsequential since all the boundaries
found would result in correct categorization of actually
occurring fricatives and stops.

The effect of the amplitude of the consonant in the 
fricative-based experiment was one of increasing stop 
responses with increased amplitude. Since stop consonants 
are higher in intensity than fricatives, this effect was 
predictable. 

In one experiment, the explosive onset of the stops was 
removed. Eighty percent of the resulting stimuli, burstless 
and gradual onset stops, were recognized as stops by 
subjects. Presumably, the short duration of the stops 
overcomes the lack of explosion. Although onset slope does 
not influence manner recognition of normal duration stops, 
we would expect that a stop burst followed by a longer dura- 
tion of noise would be identified as an affricate. In 
English, however, there are no labial or dental affricates. 
Duration, therefore, is a sufficient cue to distinguish 
stops and fricatives at these places of articulation. 

Duration is a more robust manner cue than abruptness of 


onset or intensity. The latter could easily be washed out
by extraneous noise. Yet, while the content of a signal
might not be perceived, its duration could still be
estimated. Duration, then, is a reliable cue for stop/fricative
identification. While onset slope and intensity may
influence manner identification, the primary perceptual
difference between a fricative and a stop is duration.

VII. Appendix A: Alligator Programs


A. "Randomize" 

C THIS IS A PROGRAM TO RANDOMIZE
C A LIST OF SEGMENT NAMES.
C THE SEGMENT NAMES ARE IN A SOURCE
C FILE CALLED SEGNAMES.
C THE RANDOMIZED LIST GOES INTO
C THE SINK FILE CALLED RAND.
C
CLEAR VI
C THE ARRAY VALUE IN VARLIST IS THE
C NUMBER OF SEGMENTS TO BE RANDOMIZED
DATA VARLIST(28)*8
DATA I 1
C
SOURCE SEGNAMES
LABEL 1
READ *SOURCE &VARLIST(&I)
ADD &I 1
IF &I EQ 29 GOTO 2
GOTO 1
C
C RANDOMIZING
LABEL 2
EM DO:RAND
SINK DO:RAND
C THE FIRST PARAMETER AFTER *SINK IN THE
C RAND COMMAND IS THE SEED NUMBER
C THE SECOND IS THE NUMBER OF REPLICATIONS,
C AND THE THIRD IS THE LIST TO BE RANDOMIZED.
RAND *SINK 1000 3 &VARLIST
REL *SINK
END
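
For readers unfamiliar with Alligator, the randomization step can be
approximated in Python as follows. This is a sketch under stated
assumptions, not a translation: it assumes that each replication is an
independent shuffle of the whole list, which the listing does not confirm,
and the function name and segment names are hypothetical. The seed (1000)
and the number of replications (3) are taken from the RAND command above.

```python
import random


def randomize_segments(names, seed=1000, replications=3):
    """Return `replications` shuffled copies of `names`.

    Assumption: every replication is an independent permutation of the
    full list, generated from a single seeded random stream.
    """
    rng = random.Random(seed)
    blocks = []
    for _ in range(replications):
        block = list(names)  # copy so the input list is left untouched
        rng.shuffle(block)
        blocks.append(block)
    return blocks


if __name__ == "__main__":
    # Hypothetical segment names; the thesis list held 28 of them.
    segments = ["BAT1", "VAT1", "DEE1", "THEE1"]
    for block in randomize_segments(segments):
        print(block)
```

Fixing the seed makes the presentation order reproducible, so the same
randomized list can be regenerated later when the responses are scored.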

B. "Playrand"

C A PROGRAM TO PLAY BACK A
C LIST OF SEGMENTS, WHETHER
C RANDOMIZED OR NOT.
DATA I 1
DATA SEG*8
DATA FILE*6
SOURCE SFNAME
LABEL 1
CL WA
READ *SOURCE &FILE &SEG
GET D1:&FILE
LQ &SEG
P
WAIT 1 SEC
IF &I EQ 10 GOTO 2
ADD &I 1
GOTO 1
LABEL 2
WAIT 3 SEC

LET &I 1

GOTO 1
END

C. "Measure"

C A PROGRAM TO MEASURE A
C SIGNAL'S DURATION, PEAK,
C AND AREA.
C THE SEGMENT NAMES ARE IN
C A FILE CALLED SEGNAMES,
C BUT THE SEGMENTS THEMSELVES ARE
C IN A FILE CALLED SEGFIL.
DATA SEG*8
SOURCE D1:SEGNAMES
GET D1:SEGFIL
LABEL 1
CL WA
READ *SOURCE &SEG
L &SEG
MEAS DUR &X &Y MSEC
PRINT &SEG
PRINT DURATION= &X &Y
MEAS PEAK &X &Y
PRINT PEAK= &X &Y
MEAS AREA &X &Y
IFERROR 50 PRINT 50
PRINT AREA= &X &Y
GOTO 1
END

D. "Derand"

C A PROGRAM TO DERANDOMIZE A LIST 
DATARS £0 

DA Ae 

DATA CR 

DATA K 

DATA RESP 

DATA STIMNO 

DATA FILE*6 

DATA FILE2*6 

WRITE *TTY NAME OF FILE TO BE DERANDOMIZED?
READ *TTY &FILE2
WRITE *TTY NAME OF SINK FILE?
READ *TTY &FILE
WRITE *TTY STARTING AT LINE?
READ *TTY &K

CRE D1:&FILE SIZE=3 

SINK D1:&FILE 

LABEL 2 

LET SAS NaC 

SOURCE D1: &FILE2(1) (&K) 

LABEL 1 

ADDE&W 1 

C IN THIS INSTANCE, THE NUMBER OF ITEMS 
C IN A REPLICATION IS 96. 
LEPKI EO 97 GOTO 2 

READ *SOURCE &STIMNO &RESP 

IF &STIMNO NE &I GOTO 1
WRITE *SINK &RESP
ADD &I 1

IF &I LT 97 GOTO 1

END 
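
The derandomizing step amounts to putting the responses from one
replication back into stimulus-number order. A minimal Python sketch of
that idea follows. It assumes that each record is a (stimulus number,
response) pair and that every stimulus number from 1 to the number of items
occurs exactly once in a replication; the function name and the sample data
are hypothetical.

```python
def derandomize(records, n_items):
    """Return the responses ordered by stimulus number 1..n_items.

    `records` holds (stimulus_number, response) pairs in presentation
    order. A KeyError is raised if a stimulus number is missing.
    """
    by_stimulus = {stimulus: response for stimulus, response in records}
    return [by_stimulus[number] for number in range(1, n_items + 1)]


if __name__ == "__main__":
    # Hypothetical replication of three items, presented in random order.
    presented = [(3, "b"), (1, "v"), (2, "b")]
    print(derandomize(presented, n_items=3))  # ['v', 'b', 'b']
```

A dictionary lookup replaces the repeated scanning of the source file that
the listing performs, but the resulting order is the same.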

E. "Switchbox"

C A PROGRAM TO PLAY THE SIGNALS
C FOR A CATEGORIZATION EXPERIMENT
C AND READ THE RESPONSES MADE ON A
C TOUCHPAD RESPONSE BOX.
C SOURCE FILE MUST CONTAIN:
C 1) RANDOMIZATION NUMBERS
C 2) NAME OF FILE WHERE THE SEGMENT IS STORED
C 3) NAME OF SEGMENT.
C THE SEGMENT NAMES ARE ALREADY IN RANDOM ORDER
SINK D1:DATFIL
SOURCE D2:SOUFIL
CL AL
DATA SEG*8
DATA FILE*6
DATA MASK(12) 1 2 3 4 5 6 7 8 9 10 11 12
DATA AV(12)
DATA MAXSW 12
DATA RESP
DATA SNUM
DATA I
DATA J
DATA CNT
LABEL 2
ADD &I 1
READ *SOURCE &SNUM &FILE &SEG
IFERROR 43 GOTO 6
GET D1:&FILE
LQ &SEG
LABEL 3
P
WAIT 50 MS
PULSE 5 512 5 MS
WAIT 20 MS
PULSE 5 0 5 MS
RDSW XL: 1 &MASK &AV
PULSE 5 512 50 MS
PULSE 5 0 5 MS
LET &J 1
LET &CNT 0
LABEL 4
IF &AV(&J) EQ 4095 GOTO 5
LET &RESP &J
ADD &CNT 1
LABEL 5
ADD &J 1
IF &J LE &MAXSW GOTO 4
IF &CNT NE 1 GOTO 3
IF &RESP GT &MAXSW GOTO 3
WAIT 200 MS
WRITE *SINK &SNUM &SEG &RESP
CL WA
GOTO 2
LABEL 6
REL
END

F. "Swscore"
C A PROGRAM TO DERANDOMIZE
C AND REORDER INTO A RESPONSE
C ARRAY THE RESPONSES GIVEN TO A
C CATEGORIZATION EXPERIMENT USING
C A TOUCHPAD RESPONSE BOX.
C (SEE "SWITCHBOX" PROGRAM.)
DATA SNUM
DATA RESP(12)
DATA SEG*8
DATA R
DATA K
DATA SOUF(4)*6 SUB11 SUB12 SUB21 SUB31
DATA SINKF(4)*6 S11 S12 S21 S31

LET &M 0

LABEL 9 

ADD &M 1 

IF &M EQ 5 GOTO 7 

CRE D2:&SINKF(&M) SIZE=11 

SINK D2:&SINKF(&M) 

LET &I 0
LET &J 0

LABEL 1 

SOURCE D2:&SOUF(&M) 

ADD &J 1 

C IN THIS INSTANCE THE NUMBER OF ITEMS 

C IN A REPLICATION IS 72 

IF &J EQ 73 GOTO 5
LET &I 0
LABEL 2
ADD &I 1
C IN THIS INSTANCE THE NUMBER OF REPLICATIONS
C IN A RUN IS 3.
IF &I EQ 4 GOTO 4

LABEL 3 

READ *SOURCE &SNUM &SEG &R 

IF &SNUM NE &J GOTO 3 
IF &R EQ 1 ADD &RESP(1) 1
IF &R EQ 5 ADD &RESP(2) 1
IF &R EQ 9 ADD &RESP(3) 1
IF &R EQ 2 ADD &RESP(4) 1
IF &R EQ 6 ADD &RESP(5) 1
IF &R EQ 10 ADD &RESP(6) 1
IF &R EQ 3 ADD &RESP(7) 1
IF &R EQ 7 ADD &RESP(8) 1
IF &R EQ 11 ADD &RESP(9) 1
IF &R EQ 4 ADD &RESP(10) 1
IF &R EQ 8 ADD &RESP(11) 1
IF &R EQ 12 ADD &RESP(12) 1
GOTO 2
LABEL 4
WRITE *SINK &SNUM &SEG

WRITE *SINK &RESP 

C THE FOLLOWING LOOP RESETS THE 
C VALUES IN THE RESPONSE ARRAY TO ZERO. 
LET &K 1 

LABEL 6 

LET &RESP(&K) 0 

ADD &K 1 

IF &K LT 13 GOTO 6
GOTO 1
LABEL 5
GOTO 9

LABEL 7 

REL 

END 
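
The scoring step, which tallies the switch responses for each stimulus into
a twelve-place response array, might be sketched in Python as below. The
mapping from switch number to array position is an assumption made for
illustration, since that part of the listing is difficult to read; the
function name, segment names, and sample records are likewise hypothetical.

```python
# Assumed mapping from switch number to 1-based response-array position.
SWITCH_TO_SLOT = {
    1: 1, 5: 2, 9: 3, 2: 4, 6: 5, 10: 6,
    3: 7, 7: 8, 11: 9, 4: 10, 8: 11, 12: 12,
}


def tally(records, n_slots=12, switch_to_slot=SWITCH_TO_SLOT):
    """Count responses per stimulus across replications.

    `records` holds (stimulus_number, segment_name, switch_number)
    triples in any order. Returns (stimulus_number, segment_name, counts)
    tuples sorted by stimulus number.
    """
    rows = {}
    for stimulus, segment, switch in records:
        row = rows.setdefault(
            stimulus, {"segment": segment, "counts": [0] * n_slots}
        )
        row["counts"][switch_to_slot[switch] - 1] += 1
    return [
        (stimulus, rows[stimulus]["segment"], rows[stimulus]["counts"])
        for stimulus in sorted(rows)
    ]


if __name__ == "__main__":
    # Hypothetical records: three replications of stimulus 1, one of 2.
    data = [(1, "B10", 1), (2, "V20", 12), (1, "B10", 5), (1, "B10", 1)]
    for row in tally(data):
        print(row)
```

Each output row corresponds to the pair of lines written per stimulus in
the listing: the stimulus number and segment name, then the response array.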


VIII. Appendix B: Experiment I Response Sheets 


Janet Schwegel's Experiment          Name:                    Date:
83/08

Please circle the word heard at the end of the sentence "Please say the
word ____." Also, indicate your confidence in your answer, on a scale
of one to three, by circling the appropriate number.

The first ten are practice sentences. There is a pause after every tenth.

Scale:  1 (not very sure)   2 (fairly sure)   3 (very sure)

 a   bat     vat      1 2 3        21   bee     vee      1 2 3
 b   though  dough    1 2 3        22   thee    dee      1 2 3
 c   dough   though   1 2 3        23   bee     vee      1 2 3
 d   thee    dee      1 2 3        24   vat     bat      1 2 3
 e   bee     vee      1 2 3        25   dee     thee     1 2 3
 f   vat     bat      1 2 3        26   vote    boat     1 2 3
 g   dee     thee     1 2 3        27   bee     vee      1 2 3
 h   than    Dan      1 2 3        28   vote    boat     1 2 3
 i   boat    vote     1 2 3        29   dough   though   1 2 3
 j   vee     bee      1 2 3        30   vat     bat      1 2 3
 1   bat     vat      1 2 3        31   bat     vat      1 2 3
 2   than    Dan      1 2 3        32   though  dough    1 2 3
 3   dough   though   1 2 3        33   dough   though   1 2 3
 4   though  dough    1 2 3        34   thee    dee      1 2 3
 5   bat     vat      1 2 3        35   bee     vee      1 2 3
 6   though  dough    1 2 3        36   vat     bat      1 2 3
 7   dough   though   1 2 3        37   dee     thee     1 2 3
 8   though  dough    1 2 3        38   than    Dan      1 2 3
 9   dee     thee     1 2 3        39   boat    vote     1 2 3
10   thee    dee      1 2 3        40   vee     bee      1 2 3
11   dee     thee     1 2 3        41   bat     vat      1 2 3
12   vee     bee      1 2 3        42   thee    dee      1 2 3
13   bat     vat      1 2 3        43   boat    vote     1 2 3
14   thee    dee      1 2 3        44   thee    dee      1 2 3
15   boat    vote     1 2 3        45   bee     vee      1 2 3
16   vote    boat     1 2 3        46   vat     bat      1 2 3
17   Dan     than     1 2 3        47   dee     thee     1 2 3
18   though  dough    1 2 3        48   vee     bee      1 2 3
19   Dan     than     1 2 3        49   dee     thee     1 2 3
20   though  dough    1 2 3        50   vee     bee      1 2 3